Building AI Systems That Think Like Scientists in Life Sciences - Annabel Romero of Deloitte
🎯 Summary
Podcast Summary: Building AI Systems That Think Like Scientists in Life Sciences with Annabel Romero (Deloitte)
This 31-minute episode explores how AI, particularly large language models (LLMs) and structure prediction tools like AlphaFold, is reshaping the life sciences by moving beyond human language to interpret the “language of evolution” encoded in DNA and proteins. Annabel Romero, a specialist leader in AI for drug discovery at Deloitte with a background in structural biology, emphasizes how these systems accelerate scientific discovery by identifying complex patterns that were previously invisible to human analysis.
The core narrative frames biological data, both DNA (nucleotides) and proteins (amino acids), as inherent languages. Proteins, written in a 20-letter alphabet, form complex “sentences” that fold into structures, and that form dictates function. AI models trained on vast sets of conserved biological sequences can now read these patterns, significantly speeding up tasks that once took decades, such as determining protein structures.
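To make the "20-letter alphabet" framing concrete, here is a minimal sketch of how a protein sequence can be turned into integer tokens, the kind of input a language model consumes. It is an illustration only: the helper names `encode` and `decode` and the example peptide are hypothetical, and real protein language models typically add special tokens and handle non-standard residues, which this sketch does not.

```python
# The 20 standard amino acids as one-letter codes, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence):
    """Map a protein sequence to integer token ids (hypothetical helper)."""
    sequence = sequence.upper()
    unknown = sorted(set(sequence) - set(AMINO_ACIDS))
    if unknown:
        # This sketch rejects anything outside the 20 standard residues.
        raise ValueError(f"non-standard residues: {unknown}")
    return [TOKEN_ID[aa] for aa in sequence]


def decode(token_ids):
    """Map integer token ids back to a one-letter protein sequence."""
    return "".join(AMINO_ACIDS[i] for i in token_ids)


# Hypothetical short peptide, used only to show the round trip.
tokens = encode("MKTAYIAK")
print(tokens)
print(decode(tokens))
```

Once a sequence is a list of integers, it can be fed to the same kinds of model architectures used for text, which is the sense in which the episode treats proteins as a language.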
1. Focus Area
The discussion centers on the application of Large Language Models (LLMs) and Generative AI (Diffusion Models) in Life Sciences R&D, specifically focusing on drug discovery, protein structure prediction (AlphaFold), and cross-disciplinary applications in agriculture and allergen research. A key theme is building AI systems that augment, rather than replace, scientific creativity and interpretability.
2. Key Technical Insights
- Biology as Language: Proteins are written in a language of 20 “letters” (amino acids). Patterns in these sequences, such as conservation across species, are now decipherable by LLMs, offering insights into function far faster than traditional bioinformatics.
- AlphaFold’s Role: AlphaFold is a critical tool that translates sequence data into predicted 3D structures, which is essential for understanding drug targets. However, its outputs require expert interpretation (like a detective solving a mystery) to avoid misinterpretation of context (e.g., whether the protein is isolated or part of a complex).
- Generative Protein Design: The next frontier involves using diffusion models to create entirely novel proteins (“de novo binders”) designed to bind specific targets, moving beyond traditional small molecules or antibodies for therapeutic intervention.
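The "conservation across species" pattern mentioned in the insights above can be illustrated with a small sketch that scores each column of a sequence alignment by how often its most common residue appears. This is a deliberately simple stand-in, not the method used by AlphaFold, Atlas, or any protein language model; the function name `column_conservation` and the toy alignment are hypothetical.

```python
from collections import Counter


def column_conservation(aligned):
    """Return, for each alignment column, the fraction of sequences
    that share the most common residue at that position."""
    lengths = {len(seq) for seq in aligned}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned):
        most_common_count = Counter(column).most_common(1)[0][1]
        scores.append(most_common_count / len(column))
    return scores


# Toy alignment of four made-up sequences (not real proteins).
alignment = ["MKTAY", "MKSAY", "MRTAY", "MKTGY"]
scores = column_conservation(alignment)
print(scores)
```

Positions that score close to 1.0 are the ones that stay the same across sequences, which is the kind of signal that often points to functionally important residues.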
3. Business/Investment Angle
- Accelerated Drug Targeting: The ability to rapidly model protein structures (human targets) drastically improves the speed and specificity of drug discovery, especially for rare diseases.
- Cross-Sector Synergy: The underlying AI methodologies used in drug discovery (e.g., small molecule design) are directly transferable to AgriTech (e.g., designing herbicides), treating the plant as a “patient.”
- Regulatory Requirement: Understanding the precise mechanism of action at the molecular level (often requiring structural knowledge) is increasingly mandated by regulatory bodies like the FDA for new drug approvals.
4. Notable Companies/People
- Annabel Romero (Deloitte): Specialist Leader in AI for Drug Discovery; provides the expert perspective bridging structural biology and AI strategy.
- AlphaFold: Highlighted as the breakthrough tool that demonstrated the power of AI in predicting protein structure from sequence.
- Atlas (Deloitte Initiative): The internal system being developed to integrate these diverse biological insights (LLMs, AlphaFold predictions, experimental data) to model biological complexity across multiple layers.
5. Future Implications
The industry is moving toward a future where AI acts as a sophisticated scientific partner, capable of solving complex biological mysteries (like allergen cross-reactivity between birch pollen and certain fruits) in seconds rather than decades. There will be a greater focus on generative biology, where custom therapeutic proteins are engineered via diffusion models. Crucially, the conversation stresses the need for interpretability and clear boundaries when deploying these powerful tools in regulated environments.
6. Target Audience
This episode is highly valuable for AI/ML professionals in the Life Sciences (Drug Discovery, Bioinformatics), R&D Leaders, Biotech/Pharma Executives, and Technology Strategists interested in the practical, high-stakes applications of Generative AI beyond consumer-facing products.
💬 Key Insights
"Absolutely. Really, really, I think we're in store for a very surprising 2025, 2026 on numerous fronts, not least of which are GenAI, AI, and some step-level changes that we're seeing in generative AI."
"So you see now you're not limited to antibodies, small molecules for targeting proteins. You can now target proteins with generated proteins using diffusion models."
"There's something called the de novo generation of proteins. People call them 'de novo binders.' And this can have all sorts of ranges of sizes. They are based on diffusion models."
"With the use of large language models in Atlas, we were able to link all these allergens, that proteins, back to the birch challenge. One of the closest ones being apple or alder... Just with the large language models, we were just able to predict that. So you see how it's that jump from something that took forever to something that now in a matter of seconds we can figure it out."
"One of the things combining things like large language models for proteins, things like AlphaFold, plus any other things that are useful for designing of small molecules, either for drug targeting or even to design like herbicide, that makes that bridge and that commonality between what you will normally see in drug discovery and agriculture technology."