Episode 25: How Data-Driven Growth Redefined a Media Giant

Unknown Source October 02, 2025 56 min
artificial-intelligence generative-ai ai-infrastructure anthropic meta

🎯 Summary

This episode features an in-depth conversation with Sergey Fogelson, VP of Data Science at Televisa Univision, the world’s largest Spanish-language media company, detailing their journey of data unification, infrastructure build-out, and leveraging advanced analytics to fuel massive digital growth.


1. Focus Area

The discussion centers on data science transformation within a legacy media conglomerate. Key themes include data unification, building core data infrastructure (specifically the Household Graph), developing data products for internal clients and revenue generation, and personalization strategies for the DTC streaming service (VIX). It also covers the practical application of machine learning techniques such as embeddings for audience understanding and content recommendation, all in the context of serving a massive, multicultural, Spanish-speaking audience.

2. Key Technical Insights

  • Televisa Univision Household Graph: A flagship infrastructural product built to represent the company’s “best guess” of households interested in Spanish-language media across the US, moving beyond simple ethnicity tracking to focus on content engagement signals.
  • Embedding Applications Beyond Text: The team successfully adapted algorithms traditionally used for text (like Word2Vec/GloVe) to create embeddings for sequential user behavior histories (e.g., sequence of shows watched). These embeddings represent entities like shows, actors, or geographic regions, enabling similarity matching for scaling niche audiences.
  • Consumption Behavior Over Content Similarity: For recommendation systems, the team prioritizes embeddings derived from how content is consumed (consumption behavior similarity) over purely content-based similarity (e.g., genre matching), as behavioral affinity has proven more effective at driving engagement.
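To make the behavioral-embedding idea concrete, here is a minimal sketch in Python. It is not the team's implementation: the episode describes adapting Word2Vec/GloVe-style algorithms, whereas this example swaps in plain co-occurrence counts as a simplified stand-in. The show names, the viewing histories, and the helper functions are all hypothetical. Each show is represented by how often it appears near other shows in the same viewing history, and shows are compared with cosine similarity, so two titles end up close because the same audiences watch them together, not because they share a genre label.

```python
from math import sqrt

# Hypothetical viewing histories: one list per household, in watch order.
watch_histories = [
    ["novela_a", "novela_b", "novela_c"],
    ["novela_a", "novela_b", "news_x"],
    ["futbol_1", "futbol_2", "news_x"],
    ["futbol_1", "futbol_2", "futbol_3"],
    ["novela_b", "novela_c", "novela_a"],
]


def cooccurrence_vectors(histories, window=2):
    """Build one sparse count vector per show from its neighbors in each history.

    The show itself is counted as part of its own context (a simplifying
    assumption), so shows that always appear together get near-identical vectors.
    """
    vectors = {}
    for history in histories:
        for i, show in enumerate(history):
            context = vectors.setdefault(show, {})
            lo, hi = max(0, i - window), min(len(history), i + window + 1)
            for j in range(lo, hi):
                neighbor = history[j]
                context[neighbor] = context.get(neighbor, 0.0) + 1.0
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(key, 0.0) for key, weight in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(show, vectors, top_n=3):
    """Rank the other shows by behavioral similarity to `show`."""
    target = vectors[show]
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != show]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


vectors = cooccurrence_vectors(watch_histories)
neighbors = most_similar("novela_a", vectors)
print(neighbors)
```

The same nearest-neighbor lookup could, under these assumptions, be used to scale a niche audience: start from a small seed set of shows or households and expand to the entities whose vectors sit closest to it. A trained embedding model would replace the raw count vectors with dense ones, but the similarity step would look much the same.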

3. Business/Investment Angle

  • Massive Digital Growth: The data modernization efforts directly correlated with significant business success; the digital business has grown at least 10x in the four years since Sergey joined, demonstrating the ROI of data infrastructure investment.
  • Balancing Internal vs. Revenue Products: A key strategic focus is creating data products that either simplify internal operations or directly drive company revenue, ensuring data science efforts are tied to tangible business outcomes.
  • DTC Success (VIX): The launch and scaling of the VIX streaming service (free and paid tiers) necessitated rapid development of personalization and messaging infrastructure, validating the investment in a unified data foundation.

4. Notable Companies/People

  • Sergey Fogelson (VP of Data Science, Televisa Univision): The central figure detailing the data transformation strategy.
  • Televisa Univision: The combined entity of Televisa (content creation powerhouse) and Univision (US distribution leader), serving an estimated 250 to 300 million people globally.
  • VIX: The company’s successful Spanish-language direct-to-consumer streaming service.
  • Duncan Gilchrist (Delphina): Host and interviewer, who highlights the practical wisdom in Fogelson’s approach: focusing on clear, small improvements rather than jumping straight to complex tools like LLMs.

5. Future Implications

The conversation suggests that for large, established media companies, the next phase of data science involves moving beyond basic unification and modeling into deeper, nuanced personalization driven by behavioral embeddings. The focus will remain on leveraging proprietary first-party data to create unique audience representations (like the Household Graph) that cannot be easily replicated by competitors, ensuring cultural relevance and maximizing engagement across diverse linguistic and cultural segments.

6. Target Audience

This episode is highly valuable for data science leaders, VPs and directors of data engineering, media and entertainment technology strategists, and professionals interested in scaling ML/AI within large, established enterprises. It offers practical wisdom on infrastructure building and applied ML rather than purely theoretical AI research.

🏢 Companies Mentioned

Roberto Mejri ✅ big_tech
Ad-tech company (unnamed) ✅ ai_infrastructure_data
Adobe ✅ big_tech
Cloud Provider ✅ unknown
Cloud Provider A ✅ unknown
AI I ✅ unknown
But I ✅ unknown
Roberto Mejri ✅ unknown
Champions League ✅ unknown
The Clips ✅ unknown
Factorization Machine ✅ unknown
The CPMs ✅ unknown
Personally Identified Information ✅ unknown
Household Graph ✅ unknown
Paramount Plus ✅ unknown

💬 Key Insights

"If you can't assign those [probabilities], you basically have to have your end users check everything, and that makes it so that it actually takes, in some cases, longer leveraging the outputs of the LLMs than it would have been for the user to just do the thing themselves, which frankly was the whole point of why everyone's so excited about generative AI, right? It's supposed to shrink our time to ship..."
Impact Score: 10
"I would just say that at this point, I think the way to really—to really take LLMs to the next level of usability there has to be some way that for specific use cases, there's some kind of a confidence assigned to the output that the LLM makes. There has to be some way to provide probabilities against the outputs."
Impact Score: 10
"And currently, there are no error bars that you can see when you have outputs from LLM infrastructure. There's just—it's just you get some output, and there's no way to say that the likelihood that this output, whatever it is... is in the LLM itself has some certainty around it."
Impact Score: 10
"LLMs just in general are really, really good at making their outputs appear incredibly confident. The problem is—so, as a person that has been doing what I would call classic data science and machine learning for the vast majority of my career... we really like understanding the level of uncertainty with a piece of output. I really like seeing error bars on things, right?"
Impact Score: 10
"I would say in this case, it was this way, but a good place to use generative AI because it didn't require—there was nothing at the end of the day that was directly surfaced to the user. It was behind, I would say, it was part of the infrastructure for our algorithm as opposed to something that was just directly being surfaced to the user..."
Impact Score: 10
"We found actually some significant improvement. It was on the order of 10 to 15 percent increases in overall engagement against the original model..."
Impact Score: 10

📊 Topics

#artificialintelligence 87 #generativeai 10 #aiinfrastructure 7

Generated: October 06, 2025 at 04:09 AM