Is RL + LLMs enough for AGI? — Sholto Douglas & Trenton Bricken
🎯 Summary
This 144-minute episode features Sholto Douglas and Trenton Bricken (both now at Anthropic) revisiting their predictions from the previous year (2024) to assess the progress made in combining Reinforcement Learning (RL) with Large Language Models (LLMs) toward achieving Artificial General Intelligence (AGI).
1. Focus Area
The primary focus is the convergence of Reinforcement Learning (RL) and Large Language Models (LLMs), specifically examining how RL, particularly RL from verifiable rewards, is unlocking expert-level performance in complex domains. Key areas discussed include the development of agentic behavior, the challenges of long time-horizon tasks, and the comparison between domains that are easily verifiable (like competitive programming and software engineering) versus those requiring subjective judgment (like creative writing).
2. Key Technical Insights
- RL Success in Verifiable Domains: The biggest change in the past year is the conclusive demonstration that RL can push LLMs to expert, human-level reliability, particularly in domains with clean, verifiable reward signals (e.g., passing unit tests in software engineering or solving math problems).
- Limitation Shift from Reliability to Context/Scope: The primary bottleneck for agents is no longer just the “extra nines of reliability” (as previously thought), but rather the lack of context, memory systems, and the ability to handle complex, multi-file changes or long-horizon discovery tasks.
- RL vs. Pre-training Compute Allocation: There is a current imbalance where labs spend vastly more compute on base model pre-training than on subsequent RL fine-tuning. The speakers argue that RL is a crucial area for adding new knowledge and capabilities, suggesting compute allocation in RL will rapidly increase as algorithms mature.
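The "clean, verifiable reward signal" described above can be illustrated with a toy sketch: score a model-generated solution by whether it passes a set of unit tests, yielding a binary reward an RL loop could optimize. This is a hypothetical illustration (the function and test format are invented for this example), not any lab's actual pipeline.

```python
def verifiable_reward(candidate_src: str, tests: list) -> float:
    """Return 1.0 if the candidate code passes every test, else 0.0.

    `candidate_src` is expected to define a function named `solve`
    (an assumed convention for this sketch).
    """
    namespace = {}
    try:
        exec(candidate_src, namespace)  # load the candidate solution
    except Exception:
        return 0.0  # code that doesn't even run earns no reward
    fn = namespace.get("solve")
    if fn is None:
        return 0.0
    for args, expected in tests:
        try:
            if fn(*args) != expected:
                return 0.0  # any failing test zeroes the reward
        except Exception:
            return 0.0
    return 1.0

# Example: a correct vs. a buggy "addition" solution.
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
good = "def solve(a, b):\n    return a + b"
bad = "def solve(a, b):\n    return a - b"
print(verifiable_reward(good, tests))  # 1.0
print(verifiable_reward(bad, tests))   # 0.0
```

The binary pass/fail signal is exactly what makes domains like competitive programming amenable to RL: there is no ambiguity for the reward model to exploit.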
3. Business/Investment Angle
- Software Engineering Agents Maturing: The speakers predict that by the end of 2026, competent software engineering agents capable of performing a junior engineer's full day of work will be commonplace, driven by the inherent verifiability of code tasks.
- Verifiability Dictates Acceleration: Domains with objective success metrics (like scientific discovery leading to patents, as exemplified by a drug discovery case) will see faster AI acceleration than subjective creative fields (like Pulitzer-winning novels).
- Compute vs. Scaffolding Trade-off: Companies are currently optimizing the trade-off between spending compute (letting the model “hit the typewriter” until it succeeds) versus spending dollars on human time to create bespoke, scaffolded training environments (curricula) for specific skills.
4. Notable Companies/People
- Anthropic: Both speakers are now affiliated with Anthropic.
- OpenAI (o1/o3): Mentioned for the roughly 10x increase in RL compute from o1 to o3, indicating that RL compute is being scaled up after a model's initial release.
- DeepMind/AlphaGo: Used as a historical example of RL teaching agents new knowledge beyond human performance, provided sufficient compute and a clean signal.
- FutureHouse (Sam Rodriques): Mentioned for using LLMs to brainstorm and propose wet-lab experiments that led to the discovery of a new drug.
5. Future Implications
The conversation suggests a near-term future where agentic software engineering becomes a reality, driven by clean feedback loops. The long-term path to AGI hinges on solving the long time-horizon problem and developing more general procedures for skill acquisition, rather than relying solely on bespoke, heavily scaffolded environments for every new task. The speakers also touch upon the possibility that current models are still under-parameterized relative to the human brain, limiting their ability to generalize efficiently.
6. Target Audience
This episode is highly valuable for AI researchers, ML engineers, AI product managers, and technology investors interested in the practical roadmap for scaling LLMs into autonomous agents and the technical bottlenecks currently facing the industry.
💬 Key Insights
"compute becomes the most valuable resource in the world."
"if this scenario turns out true, then compute becomes the most valuable resource in the world."
"the one that I feel almost guaranteed to get and there's like an almost strong statement to me is one where like at the very least you get drop in like white-collar worker at some point in the next five years"
"we are dealing with alien brains here who don't have the social norms of humans or even a clear notion of like what they have and haven't learned that we have of them"
"another one that we've just read that it's worth making more explicit is this even if the AI models are not helping right the next training algorithm for their successor just the fact that if they had human-level learning efficiency whatever a model is learning on the job or whatever copy of the models learning on the job the whole model is learning so in effect it's getting like a thousand times less effici"
"I think language models are also just really weird right like with the emergent misalignment work. I don't know if they took predictions they should have like hey I'm going to fine-tune ChatGPT on code vulnerabilities is it going to become a Nazi and I think most people would have said no and that's what happened."