EP 595: Data First: The Strategic Playbook for AI Success

Unknown Source August 22, 2025 28 min
artificial-intelligence ai-infrastructure generative-ai google
22 Companies
43 Key Quotes
3 Topics

🎯 Summary

This episode of the Everyday AI Show, featuring Ashish Verma, US Chief Data and Analytics Officer at Deloitte, centers on the critical and often overlooked foundation of successful AI implementation: a robust, forward-thinking data strategy. Its core argument is that as generative AI democratizes access to powerful models, data, rather than the technology itself, becomes the primary differentiator and competitive moat.

1. Focus Area

The discussion focuses heavily on Data Strategy for AI/ML at Scale, specifically addressing the shift required by the advent of Generative AI and Agentic AI. Key themes include data procurement diversity, data governance for complex AI reasoning, and the necessity of creating scalable data access mechanisms.

2. Key Technical Insights

  • Data Diversity is Mandatory: Successful AI initiatives now require moving beyond internal data silos to incorporate third-party data, business partner data, and synthetic data to meet the demands of complex use cases.
  • Data Attribution and Labeling are Crucial for Agents: Agentic AI involves autonomous reasoning and execution, so the annotation and labeling (attribution) of the data feeding an agent must be correct for the agent to operate within defined guardrails and avoid unexpected behavior. Hallucinations are framed as an inherent property of probabilistic models, one that exposes poor data hygiene.
  • Unstructured Data as the New Frontier: A significant amount of valuable enterprise data (e.g., resumes, documents) is unstructured. AI success requires contextualizing and indexing this unstructured data (similar to how Google indexes the web) to make it queryable and useful for advanced reasoning tasks.
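
To make the indexing idea in the last bullet concrete, here is a minimal sketch of an inverted index over a few short documents. The episode does not describe an implementation, so everything here is an illustrative assumption: the function names (`tokenize`, `build_index`, `search`), the sample documents, and the choice of plain keyword matching. A production system would more likely add metadata, embeddings, and access controls.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical unstructured documents (resumes and a memo).
docs = {
    "resume_1": "Data engineer with Python and Spark experience",
    "resume_2": "Analyst skilled in SQL and Python",
    "memo_7": "Quarterly supply chain review",
}
index = build_index(docs)
print(sorted(search(index, "python")))
print(sorted(search(index, "python spark")))
```

Once the text is indexed, questions such as "which resumes mention both Python and Spark" become simple set intersections instead of manual document review.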

3. Business/Investment Angle

  • Data as the Competitive Moat: With state-of-the-art LLMs becoming commoditized, the quality, breadth, and governance of proprietary data sets are now the key competitive advantage for enterprises.
  • The Need for Data Marketplaces: To serve a large user base (like Deloitte’s 178,000 US employees) efficiently, companies must implement a “Data Marketplace”: a single, governed landing spot for all data types that replaces manual data provisioning.
  • Ambition Must Match Data Readiness: A common failure point is when an organization’s AI ambition exceeds the quality, availability, or structure of its underlying data. Data strategy must proactively enable future ambition.

4. Notable Companies/People

  • Ashish Verma (Deloitte CDAO): The primary expert, detailing Deloitte’s internal strategy, including their two-and-a-half-year-old internal data marketplace and their approach to data concierge functions.
  • Deloitte: Used as a case study for implementing data strategy at massive scale, managing hundreds of millions of dollars in third-party data procurement, and preparing for agentic workflows.
  • Google (Sponsor Mention): Briefly mentioned in an ad break regarding their Gemini capabilities for video generation.

5. Future Implications

The conversation strongly suggests the industry is moving toward complex, multi-agent orchestration where systems interact autonomously. This future necessitates standardized open protocols for agent-to-agent communication and an unprecedented level of rigor in data governance, especially for data sourced outside the organization’s traditional “four walls.” The evolution of fields like life sciences (from biometrics to gene editing) is cited as an analogy for how data advancements fundamentally change entire industries.

6. Target Audience

This episode is highly valuable for Chief Data Officers (CDOs), Chief Information Officers (CIOs), AI/ML Strategy Leaders, Data Architects, and Business Leaders involved in scaling AI initiatives. It provides strategic guidance rather than deep coding tutorials.


Comprehensive Narrative Summary

The podcast establishes that the current AI boom, driven by accessible generative models, has exposed a fundamental weakness in many organizations: a lack of mature data strategy. Jordan Wilson and Ashish Verma argue that technology is no longer the bottleneck; data is.

Verma outlines his mandate at Deloitte: ensuring that the firm’s AI ambition is always supported by a commensurate data strategy. He stresses that modern AI requires a mosaic of data sources (internal, third-party, partner, and synthetic) that goes far beyond what traditional data governance models accounted for.

A key strategic solution discussed is the Data Marketplace, which Deloitte has operated for over two years. This marketplace acts as a central catalog, managing access policies and usage criteria for hundreds of data feeds. This structure is essential to avoid being overwhelmed by requests from thousands of internal users seeking multivariate data sets that require different compute environments (CPU vs. GPU).
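
As a rough illustration of what a central catalog with access policies might look like, the sketch below models a marketplace as a registry of feeds with role-based access checks. This is not Deloitte's design, which the episode does not detail. The class names (`DataFeed`, `DataMarketplace`), the role names, the feed names, and the policy logic are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFeed:
    name: str
    source: str               # "internal", "third_party", "partner", or "synthetic"
    allowed_roles: frozenset  # roles permitted by the (hypothetical) access policy
    compute: str = "cpu"      # environment the feed is served on: "cpu" or "gpu"


class DataMarketplace:
    """A single governed catalog that answers access requests automatically."""

    def __init__(self):
        self._catalog = {}

    def register(self, feed):
        self._catalog[feed.name] = feed

    def search(self, source=None):
        """List feed names, optionally filtered by source type."""
        return sorted(
            feed.name
            for feed in self._catalog.values()
            if source is None or feed.source == source
        )

    def request_access(self, feed_name, role):
        """Return (granted, reason) based on the feed's access policy."""
        feed = self._catalog.get(feed_name)
        if feed is None:
            return (False, "unknown feed")
        if role not in feed.allowed_roles:
            return (False, "role not permitted by access policy")
        return (True, f"granted on {feed.compute} environment")


market = DataMarketplace()
market.register(DataFeed("hr_resumes", "internal", frozenset({"recruiter"})))
market.register(
    DataFeed("market_prices", "third_party",
             frozenset({"analyst", "data_scientist"}), compute="gpu")
)
print(market.request_access("market_prices", "analyst"))
print(market.request_access("hr_resumes", "analyst"))
```

Encoding the policy alongside each feed is what lets requests be answered without a manual provisioning step, which is the scaling problem the marketplace is meant to solve.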

The conversation pivots to Agentic AI, where the stakes for data quality become even higher. While human oversight can catch errors in LLM outputs, autonomous agents executing decisions require data attribution to be flawless to enforce operational guardrails. Verma highlights that the challenge of labeling and contextualizing data (business glossary, technical catalog) is magnified when agents interact with data originating outside the organization’s traditional boundaries.
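
One way to picture attribution as a guardrail is a gate that only lets fully annotated records reach an agent. The sketch below assumes a hypothetical set of required attribution fields (`source`, `owner`, `glossary_term`, `label`) and hypothetical record shapes. The episode names the concepts (labeling, business glossary, technical catalog) but not a schema.

```python
# Hypothetical attribution fields a record must carry before an agent may use it.
REQUIRED_ATTRIBUTION = ("source", "owner", "glossary_term", "label")


def missing_attribution(record):
    """Return the required attribution fields that are absent or empty."""
    meta = record.get("attribution", {})
    return [key for key in REQUIRED_ATTRIBUTION if not meta.get(key)]


def filter_for_agent(records):
    """Split records into those an agent may use and those rejected, with reasons."""
    usable, rejected = [], []
    for record in records:
        gaps = missing_attribution(record)
        if gaps:
            rejected.append((record["id"], gaps))
        else:
            usable.append(record)
    return usable, rejected


records = [
    {"id": "r1", "attribution": {"source": "partner", "owner": "finance",
                                 "glossary_term": "net_revenue", "label": "verified"}},
    {"id": "r2", "attribution": {"source": "third_party", "owner": "",
                                 "glossary_term": "net_revenue"}},
]
usable, rejected = filter_for_agent(records)
print([r["id"] for r in usable])
print(rejected)
```

Rejecting incompletely attributed records up front is a conservative choice: with no human reviewing each output, the agent simply never sees data whose provenance or meaning is unclear.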

Finally, Verma offers a sobering view of the current state of agentic orchestration. Single-agent success is still being perfected, and seamless multi-agent coordination remains a significant, unconquered frontier, which suggests that the journey toward truly autonomous AI is just beginning. The overarching advice is to invest proactively in data infrastructure and governance now, before ambition outpaces capability.

🏢 Companies Mentioned

Salesforce ✅ ai_application
ServiceNow ✅ ai_application
SAP ✅ ai_application
Google AI Pro ✅ unknown
Google Gemini ✅ unknown
Ashish Verma ✅ unknown
Jordan Wilson ✅ unknown
Everyday AI ✅ unknown

💬 Key Insights

"The question becomes if we don't participate in this, the portfolio of services that make us relevant today will make us irrelevant tomorrow because we didn't arrive at the time that AI arrived in the value chain."
Impact Score: 10
"In reality, if you look at sort of what that does to life sciences where, you know, disease pathology is now very different, right? Or going to be very different, but discovery is going to be very different, manufacturing, clinical trials, supply chain is going to be very different."
Impact Score: 10
"Anybody that is claiming that they've done this at scale and it works seamlessly, you know, we don't buy it, right? Because, you know, we do our own experimentation and we realize how hard it is, right? And we're just getting started on multi-agent orchestration."
Impact Score: 10
"if you start to look at attribution for the purposes of agentic AI, or if you look at attribution for the purposes of labeling for agentic AI, you'll pretty soon come to the conclusion that, you know, that is sort of one of the biggest drivers for why agent orchestration or registration or interoperability of agents becomes such an important component, which is why protocols like, you know, open standard protocols for agent to do it."
Impact Score: 10
"what you have to get right in essence is that the attribution on the data set that feeds that agent, you know, needs to be annotated correctly for you to be able to get that agent to sort of behave within the guardrails, the boundaries of what you're accepting and all what you're expecting the answer to be."
Impact Score: 10
"when it comes to agentic AI and when these systems are going to start using our dynamic data and start executing decisions on our behalf, I think it even more so prioritizes the importance of correct data."
Impact Score: 10

📊 Topics

#artificialintelligence 68 #aiinfrastructure 5 #generativeai 3

Generated: October 04, 2025 at 08:27 PM