📆 ThursdAI - Oct 16 - VEO3.1, Haiku 4.5, ChatGPT adult mode, Claude Skills, NVIDIA DGX Spark, World Labs RTFM & more AI news
🎯 Summary
ThursdAI Podcast Summary: VEO3.1, Haiku 4.5, ChatGPT Adult Mode, and Scientific Breakthroughs (Oct 16)
This episode of ThursdAI covered a dense week of major announcements across open-source models, large commercial LLM updates, significant hardware developments, and a groundbreaking application of AI in scientific discovery. The overarching theme was the rapid maturation of AI capabilities, particularly in multimodal generation and specialized scientific reasoning.
1. Focus Area
The discussion centered primarily on Artificial Intelligence and Machine Learning, with deep dives into:
- Large Language Models (LLMs) & APIs: Updates from Anthropic (Haiku 4.5), OpenAI (Adult Mode, Memory Management), and Microsoft (Windows 11 Copilot).
- Video Generation: The release of Google DeepMind’s VEO 3.1 and updates to competitors like Sora Pro.
- Open Source Models: New releases from Qwen (smaller VL models) and a major scientific breakthrough using Google’s C2S Scale 27B model.
- Hardware: Announcements regarding NVIDIA DGX Spark and Apple’s M5 chip.
- AI Agents & Coding: The launch of a free, ad-supported tier by AMP.
2. Key Technical Insights
- C2S Scale’s Scientific Emergence: Google’s 27B Gemma-based model surfaced a novel finding about cancer cell behavior by treating single-cell RNA (scRNA-seq) expression profiles as a “language.” The result was attributed to emergent capability at scale, drawing on over a billion tokens of biological data processed through SFT and RL stages (see the sketch after this list).
- Qwen 3 VL Performance Leap: The new smaller Qwen 3 Vision-Language models (4B and 8B) demonstrated performance matching or exceeding previous large models (such as Qwen 2.5 VL 72B) on several benchmarks. Notably, the 8B model scored 33.9 on OSWorld, far ahead of the 72B predecessor’s roughly 8, making it a credible option for on-device agent tasks.
- VEO 3.1 Enhancements: The new video model focuses on cinematic updates and improved control, signaling a move toward professional-grade video creation tools, while competitors push clip lengths past 20 seconds (Baidu’s News Streamer).
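
The “cells as language” idea above is easier to see in code. Below is a minimal, illustrative sketch of Cell2Sentence-style preprocessing, assuming a toy expression vector and gene list; it is not the actual C2S-Scale pipeline, which differs in normalization, vocabulary, and prompting.

```python
# Illustrative sketch: turn a single-cell expression profile into a "cell sentence"
# by rank-ordering genes by expression and emitting them as a token string that a
# causal LM can be trained on. Not the real C2S-Scale preprocessing.
import numpy as np

def cell_to_sentence(expression: np.ndarray, gene_names: list[str], top_k: int = 100) -> str:
    """Return the top_k expressed genes, highest expression first, as one string."""
    order = np.argsort(expression)[::-1][:top_k]           # indices of most-expressed genes
    ranked = [gene_names[i] for i in order if expression[i] > 0]
    return " ".join(ranked)

# Toy example: 6 genes, one cell
genes = ["CD19", "MS4A1", "CD3E", "GNLY", "NKG7", "LYZ"]
counts = np.array([12.0, 9.5, 0.0, 3.2, 7.1, 0.4])
print(cell_to_sentence(counts, genes, top_k=5))
# -> "CD19 MS4A1 NKG7 GNLY LYZ"
```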
3. Business/Investment Angle
- OpenAI’s Content Policy Shift: The planned introduction of “adult mode” signals OpenAI’s strategy to capture a broader user base by relaxing overly restrictive guardrails, potentially impacting market share against competitors who already offer more flexibility.
- AI in Drug Discovery: The C2S Scale breakthrough validates the massive commercial potential of applying LLMs to complex biological data, suggesting a future where AI accelerates drug discovery through high-throughput virtual screening and hypothesis generation.
- Hardware Competition Intensifies: NVIDIA’s DGX Spark targets compact, low-power local inference and prototyping, while Apple’s M5 chip emphasizes significant on-device AI acceleration, pointing to a bifurcation in hardware strategy between cloud-scale training/inference and edge/personal-device efficiency.
4. Notable Companies/People
- Google DeepMind: Highlighted for the VEO 3.1 video model (interview with Jessica Gallegos) and the revolutionary C2S Scale 27B model.
- OpenAI/Sam Altman: Mentioned for the planned release of “adult mode” and updates to ChatGPT’s memory management.
- Anthropic: Released Haiku 4.5, noted as being twice as fast as its predecessor.
- Qwen Team: Praised for the extensive testing and release of smaller, high-performing VL models.
- Cognition: Mentioned for the breaking-news release of SWE-grep (discussed with Swyx).
5. Future Implications
The conversation strongly suggests the industry is moving toward:
- Specialized, High-Impact AI: Models trained specifically on domain-specific “languages” (like biological sequences) can yield profound, emergent scientific insights, moving beyond general-purpose reasoning.
- Ubiquitous OS Integration: Microsoft embedding Copilot directly into Windows 11 signals a future where operating systems are primarily controlled via natural language commands.
- Increased Model Accessibility: The release of powerful models like C2S Scale (27B) and smaller Qwen VL models means cutting-edge performance is becoming accessible for local deployment and specialized fine-tuning.
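
On the accessibility point, running an open checkpoint of this class locally is largely standard tooling. A minimal Hugging Face transformers sketch follows; the model id is a placeholder, and the real repository name, prompt format, and memory requirements should be taken from the official model card.

```python
# Minimal sketch of loading an open ~27B causal LM locally with Hugging Face transformers.
# The model id below is a placeholder; substitute the actual repository from the release.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/c2s-scale-27b"  # hypothetical id, check the official model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to fit on a single large GPU
    device_map="auto",            # spread layers across available GPUs / CPU
)

prompt = "CD19 MS4A1 NKG7 GNLY LYZ"  # a "cell sentence"-style input, illustrative only
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```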
6. Target Audience
This episode is highly valuable for AI/ML Engineers, AI Researchers, Product Managers in Tech, and Technology Investors who need a rapid, comprehensive overview of the latest model releases, technical breakthroughs, and strategic shifts across the AI ecosystem.
🏢 Companies Mentioned
Google DeepMind, OpenAI, Anthropic, Qwen (Alibaba), Microsoft, NVIDIA, Apple, Cognition, Baidu, AMP, World Labs
💬 Key Insights
"Ruler is basically like an automated LLM judge. We do a bunch of work to make LLM judge work really, really well for RL. And in practice, this works phenomenally well, surprisingly well, where... a very hard part in doing RL was getting that reward function..."
"The Thinking Machines post about LoRAs... basically showed when you're doing RL at least, there is zero discernible difference in training all the parameters versus training a LoRA."
"If you have a bunch of different LoRAs, and LoRAs are just, it stands for Low-Rank Adapter... you can actually at inference time still batch together inference requests from different requests that are using different LoRAs, and it can all be run efficiently in the same batch as if they were all running against the same model."
"Fundamentally the way RL works is... you're going to have your deep research agent go off and research 100 different questions... Comes back with the final answer, and then you're going to take all those final answers and you're going to have to grade them some way... Then you're going to use that to train and update your model..."
"Traditionally, if you're managing your own GPUs and starting everything up that way, you might take like a couple of minutes every time you start a run... With serverless RL... it just takes a couple of seconds to start up, and you can be off to the races."
"If you've done as much as you can and you're sort of hitting a limit, it's like, hey, my agent is doing what I want 80% of the time and then there's like these 20% of the cases where it's not, that's where I think it makes sense to bring in RL and you can get that last 20% of the performance."