EP 618: RSL vs. the AI Scrape: Can LLM licensing save the open web?
🎯 Summary
Summary of Everyday AI Show: Really Simple Licensing (RSL) and Saving the Open Web
This episode of the Everyday AI Show, hosted by Jordan Wilson, features an in-depth discussion with Doug Leeds, Co-founder of Really Simple Licensing (RSL), focusing on a critical threat to the open web ecosystem: generative AI models consuming content without compensation or clear licensing frameworks.
1. Main Narrative Arc and Key Discussion Points
The central narrative revolves around the existential threat posed by AI models (like Google’s AI Overviews, ChatGPT, and Perplexity) that provide direct answers, bypassing the need for users to click through to original source websites. This shift starves publishers of necessary traffic, undermining their ability to monetize content (via ads or subscriptions) and ultimately threatening the sustainability of high-quality, human-created content that fuels the AI models themselves. RSL is presented as the necessary protocol and organizational structure to solve this imbalance.
2. Major Topics, Themes, and Subject Areas Covered
- The AI Content Consumption Crisis: The shift from link-based search to direct AI answers and its negative impact on publisher traffic and revenue.
- Licensing and Copyright: Establishing machine-readable standards for content usage rights.
- The Role of Collective Rights Organizations: Modeling a solution after music industry bodies (like ASCAP) to streamline licensing negotiations.
- Publisher Adoption and Momentum: Highlighting early supporters and the strategy for achieving critical mass.
- Benefits for AI Companies: How licensing legally accessed, high-quality data can reduce compute costs and improve answer quality (reducing hallucinations).
3. Technical Concepts, Methodologies, or Frameworks Discussed
- Really Simple Licensing (RSL) Standard: An open, machine-readable standard (found at RSLstandard.org) that allows content creators to articulate their license terms directly within existing web infrastructure, specifically referencing robots.txt. This is contrasted with the limited functionality of robots.txt alone.
- RSL Collective: A collective rights organization built on top of the RSL standard. It acts as a central negotiating body for AI companies seeking blanket licenses for content from participating publishers.
- Analogy to RSS: Doug’s partner, Eckart Walther (co-creator of RSS), was instrumental in developing RSL, drawing parallels between creating a web standard for content syndication and creating one for content licensing.
- Generative Web, RAG, and NL Web: RSL is designed to be compatible across various AI consumption methods, including Retrieval-Augmented Generation (RAG).
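To make the robots.txt idea concrete, here is a minimal sketch of how a crawler might look for licensing terms before using a site's content. It assumes a `License:` line in robots.txt that points to a terms file; that directive name, the example URL, and the helper names `find_license_urls` and `crawler_decision` are illustrative assumptions, not taken from the episode. The actual syntax is defined at RSLstandard.org.

```python
from typing import List

# Hypothetical robots.txt body. The "License:" directive name and the URL
# are assumptions for illustration; check RSLstandard.org for real syntax.
SAMPLE_ROBOTS_TXT = """User-agent: *
Allow: /
# pointer to machine-readable license terms (assumed directive)
License: https://example.com/license.xml
"""


def find_license_urls(robots_txt: str, directive: str = "license") -> List[str]:
    """Collect the value of every `<directive>: <value>` line."""
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop robots.txt comments. Simplification: this would also cut
        # off a URL that contains a '#' fragment.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip().lower() == directive and value.strip():
            urls.append(value.strip())
    return urls


def crawler_decision(robots_txt: str, accepts_terms: bool) -> str:
    """Mirror the choice described in the episode: agree to the terms or go away."""
    if not find_license_urls(robots_txt):
        return "no-license-declared"
    return "proceed" if accepts_terms else "go-away"


if __name__ == "__main__":
    print(find_license_urls(SAMPLE_ROBOTS_TXT))
    print(crawler_decision(SAMPLE_ROBOTS_TXT, accepts_terms=False))
```

A real crawler would also fetch and parse the referenced terms file before deciding; that step is left out because its format is not described in the episode.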
4. Business Implications and Strategic Insights
- The Unsustainability of “Stealing”: The current model of AI companies taking content without payment is unsustainable, leading to litigation (e.g., the Propel settlement) and a future lack of quality training data.
- Streamlining Negotiation: The RSL Collective solves the “million negotiations” problem for LLM companies by offering a single blanket license, similar to how streaming services license music rights.
- Publisher Options: Publishers currently face three choices: sue, partner independently, or go out of business. RSL offers a structured, collective partnership path.
- Quality vs. Cost: Licensing high-quality, human-written content via RSL allows AI companies to bypass expensive, complex data mashing, leading to lower compute costs and higher-quality, less hallucinatory outputs.
5. Key Personalities and Thought Leaders Mentioned
- Doug Leeds: Co-founder of Really Simple Licensing; former executive at Ask.com and instrumental in building the media group that became Dotdash Meredith.
- Eckart Walther: Doug’s partner, co-creator of RSS (Really Simple Syndication), and the architect behind the RSL standard.
- Steven Johnson: Co-founder of NotebookLM (mentioned during the sponsor break).
6. Predictions, Trends, or Future-Looking Statements
- Doug predicts that once critical mass is achieved via the RSL Collective, major LLM companies will engage in licensing deals because it offers a lower-cost, higher-quality data pipeline.
- The goal is to see the first licensing deal finalized “not too long” from the recording date.
- The future of high-quality content relies on creators being compensated, ensuring the AI ecosystem continues to have valuable data to train on.
7. Practical Applications and Actionable Advice
- For Publishers: Visit RSLcollective.org to register interest and join the collective. Joining is currently free and non-exclusive, and publishers can opt out at any time. Joining signals interest to AI companies and authorizes the collective to negotiate on the publisher's behalf.
- For AI Companies: RSL provides a clear, scalable protocol to legally access and pay for the content needed to improve model quality and reduce operational costs.
8. Controversies, Challenges, or Problems Highlighted
- The “I Don’t Like It” Approach: The current default behavior of AI companies has been to take content without permission, forcing publishers into litigation or traffic loss.
- Critical Mass: The primary immediate challenge for RSL is building sufficient publisher adoption so that the collective represents enough valuable data to make negotiating with AI companies worthwhile for the LLM providers.
- Enforcement: The collective model includes collective enforcement, meaning if a member is infringed upon, the entire collective can pursue legal action, offering stronger protection than individual publishers could manage.
9. Context: Why This Conversation Matters to the Industry
This conversation is vital because it addresses the fundamental economic and
💬 Key Insights
"What happens when you do that, billions of dollars and compute cost, and subpar answers that have their own hallucinations? Instead, if your content—if they have access to content that actually answers a question—just give that content... You don't have to worry about hallucinations."
"That to me makes me think of the early days of music on the web. Napster was taking everybody's music... But it was not going to be sustainable in that model because they weren't paying for anything. And to Apple's credit, they went to ASCAP and said, 'We need a license structure where we can actually license all the music and pay for it.'"
"We need an ecosystem that's going to survive AI and allow creators of content to get paid. Otherwise, there will not be content to train AI on, and there won't be AI."
"We can create an open standard that allows anybody to articulate their license terms for any piece of content. We can put that right in robots.txt, and crawlers can see it and they can say, oh, I now know what the licensing terms are, and I can then go away if I don't want to do it, or I can agree to those licensing terms."
"But with AI, I just get the answer. No need to go to the site. Great user experience. Terrible for the ecosystem because without that traffic as a publisher, I can't pay for the content that I'm producing."
"The problem with that is it hurts the ecosystem that the whole web, the open web, is built on, which is take some of my content, index it, and then give me a link. So when someone looks for something that I have, you send them to me and I get the traffic as the publisher, as the creator."