879: Serverless, Parallel, and AI-Assisted: The Future of Data Science is Here, with Zerve’s Dr. Greg Michaelson

Unknown Source April 15, 2025 67 min
artificial-intelligence generative-ai ai-infrastructure startup investment nvidia microsoft google
77 Companies
104 Key Quotes
5 Topics
2 Insights
1 Action Items

🎯 Summary

Podcast Episode Summary: 879: Serverless, Parallel, and AI-Assisted: The Future of Data Science is Here, with Dr. Greg Michaelson

This episode of the Super Data Science podcast features Dr. Greg Michaelson, co-founder of Zerve, discussing the significant evolution of the Zerve platform over the past year, focusing on how it addresses the challenges of modern, collaborative, and AI-augmented data science workflows. The conversation charts a course from artisanal notebook-based development toward an industrialized, platform-centric approach.

1. Focus Area

The primary focus is the industrialization of data science workflows through platform engineering, specifically highlighting Zerve's graph-based coding environment (built on directed acyclic graphs, or DAGs), its new serverless parallelization capability ("Fleet"), and the integration of AI assistants to augment coding and project creation. Secondary topics include the shift away from low-code/no-code tools, IP concerns around LLMs, and flexibility in model hosting (OpenAI, Bedrock, Hugging Face).

2. Key Technical Insights

  • Serverless Parallelization (The Fleet): Zerve introduced the "Fleet" feature, which enables massive, cost-effective parallelization of code execution (e.g., running thousands of LLM calls simultaneously) by spinning up serverless compute for each node in the directed acyclic graph (DAG), with no manual multiprocessing management required.
  • Node-Based Code Environment: Zerve is a code-first environment in which each DAG node is a code block (written in Python, R, or SQL) edited with Monaco, the engine behind VS Code. Data and memory flow are explicitly visualized between nodes, allowing real-time previews of inputs and outputs and eliminating the constant print() statements common in notebooks.
  • AI Agent Integration: Beyond simple GenAI blocks that execute prompts, Zerve features an AI assistant that acts as an agent capable of modifying the canvas itself: creating new blocks, connecting flows, and building entire analysis pipelines from natural language instructions.
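The Fleet feature itself is proprietary, but the fan-out/fan-in pattern it automates can be illustrated with a minimal sketch. The snippet below uses a plain thread pool to fan many prompts out to a stand-in `call_llm` function (a placeholder, not Zerve's or any provider's actual API); a serverless fleet would instead launch independent compute for each task, but the shape of the work is the same.

```python
from concurrent.futures import ThreadPoolExecutor

def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM API call (e.g., via OpenAI or Bedrock).
    # Here it just echoes the prompt so the sketch is self-contained.
    return f"summary of: {prompt}"

def fan_out(prompts, max_workers=32):
    # Fan the prompts out across workers and gather results in order.
    # pool.map preserves input order, so result i matches prompt i.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_llm, prompts))

prompts = [f"document {i}" for i in range(1000)]
results = fan_out(prompts)
print(len(results))  # one result per prompt
```

The point of the serverless version is that the per-task compute is billed only while each node runs, which is what makes thousands of simultaneous calls cost-effective compared with keeping a large cluster warm.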

3. Business/Investment Angle

  • The Failure of Pure Low-Code/No-Code: The industry consensus, supported by Greg’s experience at DataRobot, is that low-code/no-code tools fail for complex problems, leading major platforms to pivot back toward supporting expert coders. Value generation remains concentrated among those writing code.
  • LLMs Threaten SaaS Models: The increasing viability of building in-house, customized solutions using LLMs (especially when self-hosted) poses a significant challenge to traditional SaaS businesses that rely on proprietary, black-box functionality.
  • Full-Stack Data Science Platform: Zerve positions itself as a full-stack environment covering the entire data science lifecycle, from data connection and exploration to model training (with GPU support) and deployment (via custom APIs, SageMaker, or job scheduling), all within a single, collaborative application.

4. Notable Companies/People

  • Dr. Greg Michaelson: Co-founder of Zerve, bringing experience from DataRobot (as CCO) and Travelers Insurance; he emphasizes the need for industrialized data science.
  • Zerve: The platform central to the discussion, focused on collaborative, DAG-based coding.
  • AWS Trainium 2 & Dell AI Factory with Nvidia: Sponsors mentioned, highlighting the importance of specialized hardware for large-scale AI.
  • OpenAI, AWS Bedrock, Hugging Face: Key providers discussed regarding LLM integration and data security/hosting choices.

5. Future Implications

The future of data science is moving toward industrialized, collaborative, and AI-assisted development. Platforms like Zerve let teams manage complex, multi-language workflows seamlessly, drastically cutting cycle times (a reduction of up to 9x is mentioned). The integration of AI agents will further democratize complex pipeline creation, allowing multiple human experts and their AI assistants to work concurrently on the same project canvas.

6. Target Audience

This episode is highly valuable for hands-on practitioners, including Data Scientists, ML Engineers, and Software Developers, particularly those involved in MLOps, platform engineering, or struggling with collaboration and version control in traditional notebook environments. Professionals interested in the strategic shift toward platform-based data science will also benefit.

🏢 Companies Mentioned

Titan ai_model_developer
GPT ai_model_developer
AWS QuickSight ai_infrastructure
Snowflake data_platform
Travelers Insurance enterprise_user
ChatGPT Pro unknown
Typing Mind unknown
OpenAI APIs unknown
ChatGPT Plus unknown
Sean Coshla unknown
Satya Nadella unknown
New York Times unknown

💬 Key Insights

"Luckily, at least I have that trace. Like when you do Deep Research, it's kind of an explanation of what it's thinking about. And so, in what it's thinking about, it was like, because I don't have internet access, I'm just going to assume what kind of thing was there. And I'm like, no, that's like, so at least I can see that it had that trace, because otherwise with the output itself, you're kind of like, oh, cool, that makes sense. And so, if I hadn't gone and looked at the website or looked at the trace, I would have been led completely astray."
Impact Score: 10
"I asked ChatGPT how, you know, for some code to do that. And it invented an API out of nothing, like that didn't exist. And it was like, oh, here's the code to do that. And I have no doubt that when they do introduce that feature that the code would run, but no, it was just made up. It just completely hallucinated an API because they want to be so helpful."
Impact Score: 10
"I start to notice it's so less frequently that I start to just trust the AI systems, which is maybe risky. So, when they were frequently inaccurate or having biases, as like, you know, you were constantly on the lookout for those kinds of issues. But now that I rarely see those... it just seems like they're basically always right."
Impact Score: 10
"Instead of wanting to pay a monthly fee that's just kind of this flat fee, you instead, we've kind of gotten used to now with calling APIs and paying per number of tokens that we send or receive back from that API. We as developers... are getting more and more used to this idea of an economic model where it's consumption-based on specifically what I need instead of subscription-based."
Impact Score: 10
"Microsoft CEO Satya Nadella predicts that business logic will move from SaaS applications into AI agents."
Impact Score: 10
"You mentioned earlier in this episode how AI could kill a lot of SaaS software and service businesses. So some companies like Klarna are combining AI standardization and simplification to shut down SaaS providers."
Impact Score: 10

📊 Topics

#artificialintelligence 129 #generativeai 23 #aiinfrastructure 6 #startup 3 #investment 1


Generated: October 06, 2025 at 02:17 PM