20Product: How Scale AI and Harvey Build Product | Why PMs Are Wrong: They are not the CEOs of the Product | How to do Pre and Post Mortems Effectively and How to Nail PRDs | The Future of Product Management in a World of AI with Aatish Nayak
🎯 Summary
Podcast Summary: 20Product: How Scale AI and Harvey Build Product with Aatish Nayak
This 65-minute episode of 20Product features Aatish Nayak, Head of Product at Harvey, discussing his experience scaling product at hyper-growth AI unicorns like Scale AI and Harvey, and offering critical perspectives on the role of Product Managers (PMs) in the AI era.
1. Focus Area
The discussion centers on Product Leadership in Hyper-Growth AI Companies, covering product strategy, organizational scaling, the evolution of the PM role amidst AI advancements, effective product processes (pre/post-mortems, PRDs), and the critical importance of market selection. Specific AI topics include using LLMs (such as Claude 3.5 Sonnet) for specialized tasks like legal reasoning, and the shift from data labeling to application-layer AI products.
2. Key Technical Insights
- LLM Evaluation for Specialized Tasks: Harvey benchmarks every new frontier model against its internal legal evals; Claude 3.5 Sonnet was specifically noted as the point where long-form legal reasoning and drafting performance began to surpass OpenAI's models (a toy version of such a harness is sketched after this list).
- Interface Evolution Beyond Chat: The current chat interface is viewed as the “command line starting point” of the new AI frontier (akin to MS-DOS). Future product interfaces must move beyond linear, one-shot interactions to support complex workflows, for example feedback loops in which an AI coworker asks the user for draft revisions (an application of the IKEA Effect: users value output they helped shape).
- Reducing Customer-to-Code Distance: A key lesson from Scale AI was the necessity of minimizing layers between customer needs and engineering execution, often achieved by embedding engineers directly with customers or data contributors to capture nuance lost through human intermediaries.
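Nayak's emphasis on benchmarking every new model on open-ended legal work (see the Big Law Bench quotes below) can be made concrete with a small evaluation harness. The sketch below is a minimal, hypothetical illustration, not Harvey's actual benchmark code: the task, rubric, model names, and keyword-based grader are assumptions standing in for the expert or LLM-judge grading a real benchmark would require.

```python
"""Hypothetical open-ended eval harness (illustrative only, not Harvey's Big Law Bench).

Each task pairs a billable-work style drafting prompt with a grading rubric; models are
compared by rubric coverage rather than multiple-choice accuracy."""

from dataclasses import dataclass
from typing import Callable


@dataclass
class LegalTask:
    prompt: str        # open-ended, billable-work style instruction
    rubric: list[str]  # criteria a strong answer should address


def rubric_score(answer: str, rubric: list[str]) -> float:
    """Toy grader: fraction of rubric criteria mentioned in the answer.
    A production harness would use expert reviewers or an LLM judge instead."""
    hits = sum(1 for criterion in rubric if criterion.lower() in answer.lower())
    return hits / len(rubric)


def benchmark(models: dict[str, Callable[[str], str]],
              tasks: list[LegalTask]) -> dict[str, float]:
    """Average rubric score per model across all tasks."""
    return {
        name: sum(rubric_score(generate(task.prompt), task.rubric) for task in tasks) / len(tasks)
        for name, generate in models.items()
    }


if __name__ == "__main__":
    tasks = [
        LegalTask(
            prompt="Summarize the indemnification obligations in the attached MSA.",
            rubric=["indemnification", "cap on liability", "notice"],
        ),
    ]
    # Stand-in callables; in real use each would wrap a model provider's API.
    models = {
        "model_a": lambda p: "Indemnification applies, subject to a cap on liability, with prompt notice required.",
        "model_b": lambda p: "The agreement covers indemnification.",
    }
    print(benchmark(models, tasks))  # model_a covers more rubric criteria and scores higher
```

In practice, each model callable would wrap a provider API call, and per-task scores would be tracked over time so that every new model release can be compared against the model currently in production.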
3. Business/Investment Angle
- Market Selection is Paramount: Nayak strongly advocates that finding a huge, emerging market is the single most important factor for success, citing how Scale AI successfully pivoted its data labeling services across booming sectors (self-driving, robotics, government) to stay ahead of the curve.
- Great Markets Mask Execution Flaws: Conversely, exceptionally large markets (like on-demand ride-hailing for Uber) can sustain companies through periods of internal chaos, highlighting the market’s power over initial execution quality.
- Distribution vs. Product Moat: While distribution (marketing, sales) drives initial traction (“king”), product substance is required for long-term stability and longevity (“president”), as users eventually see through superficial offerings.
4. Notable Companies/People
- Aatish Nayak: Head of Product at Harvey; previously held leadership roles at Scale AI (scaling from 40 to 800 employees) and Shield AI.
- Scale AI: Highlighted for its success in listening to frontier customers (e.g., Nuro) and its strategic pivots across different data-intensive markets.
- OpenAI: Mentioned in the context of early RLHF work and the pivot toward becoming a product company leveraging existing distribution (e.g., mobile apps).
5. Future Implications
The future of product management involves a shift away from the “CEO of the Product” mentality. PMs must act as “WD-40”, reducing organizational friction and enabling domain experts (engineers, designers) to interface directly with customers. In the AI landscape, longevity will come not from foundation models themselves but from the UX and application layer built around them, which will require significant experimentation to define the next generation of interfaces beyond chat.
6. Target Audience
This episode is highly valuable for Product Leaders, VPs of Product, and experienced Product Managers operating in or adjacent to the AI/ML space, especially those navigating hyper-growth environments or transitioning from engineering to product management. It is also relevant for Venture Capitalists interested in market dynamics and execution quality in deep-tech startups.
🏢 Companies Mentioned
Scale AI, Harvey, Shield AI, OpenAI, Nuro, Uber
đź’¬ Key Insights
"Why do agents need humans more than humans need agents? ... Ultimately, I think the humans don't just always trust AI. They trust other humans using AI."
"I think realistically, you will run into cultural, legal, and regulatory barriers to actually wide-scale adoption of AI."
"For some of those types of extractions, Claude 3.5 Sonnet in particular is starting to get better. Every single model that's come out from OpenAI, from other competitors, we've benchmarked for since the beginning of time at Harvey. And 3.5 Sonnet is really where we're starting to see some of the performance better than OpenAI."
"And so we created this benchmark called Big Law Bench. It consists of tasks that are real billable work tasks that lawyers do on a daily basis at our biggest customers in big law firms. The nature of these tasks are, they're very much open-ended."
"To what extent are your evals the same as the public evals that are done on benchmarks? ... A lot of public legal benchmarks and then even things like, you know, Scale's humanities, last exam, they're all multiple choice. I would love if legal work was multiple choice, but any lawyer will tell you there are like a million options of what you could do."
"You will never have, in five years' time, you will not have a choice of which model, right? Yeah, exactly. And is user choice good? If you think about the paradox of choice, users get confused when they have too much choice. If you're just told, I think humans are lazy, just tell them which one."