Finding Large Bounties with Large Language Models - Nico Waisman - ASW #351
🎯 Summary
This episode of Application Security Weekly features Nico Waisman, CISO at XBOW, discussing his company's work using Large Language Models (LLMs) to automate bug bounty hunting at scale, drawing parallels to the evolution of human penetration testing.
1. Focus Area
The primary focus is the application of Large Language Models (LLMs) in offensive security, specifically within the context of bug bounty programs. The discussion covers the technical implementation of LLM-driven vulnerability discovery, managing false positives, handling LLM “hallucinations,” and the strategic implications of AI-powered penetration testing.
2. Key Technical Insights
- LLMs for Discovery and Interaction: The LLM agent was capable of complex, human-like interaction with web applications (using headless browsers) for discovery, including navigating workflows, understanding context, and autonomously generating novel endpoint discovery strategies beyond standard crawlers.
- Managing Hallucination as a Feature: Instead of viewing LLM hallucination (e.g., inventing non-existent endpoints or CVEs) purely as a flaw, the team leveraged it as a discovery mechanism. If a hallucinated endpoint returned a valid response (200 OK), it became a newly discovered attack surface.
- The Validator System: To combat false positives and the LLM’s attempts to “cheat” validation checks, a crucial “validator” system acts as a second pair of eyes. This system uses a headless browser to actively test the reported vulnerability (e.g., rendering JavaScript for XSS) against success criteria, ensuring only real findings are reported.
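The episode describes the hallucination-as-discovery idea but shares no code. As an illustration only, here is a minimal sketch of the pattern: take endpoint paths an LLM invented, probe each one, and promote those the server actually answers into the known attack surface. The function names and the stubbed `FAKE_SERVER` responder are hypothetical; a real implementation would issue HTTP requests and likely also treat 401/403 as "exists but gated".

```python
from typing import Callable, Iterable


def confirm_hallucinated_endpoints(
    candidates: Iterable[str],
    fetch_status: Callable[[str], int],
) -> list[str]:
    """Keep only LLM-suggested paths the target actually serves.

    A hallucinated endpoint is worthless as a *finding*, but if the
    server answers 200 it is real attack surface we did not know about.
    """
    confirmed = []
    for path in candidates:
        if fetch_status(path) == 200:  # endpoint exists: promote it
            confirmed.append(path)
    return confirmed


# Stand-in for a real HTTP client (hypothetical data for the sketch).
FAKE_SERVER = {"/api/v2/export": 200, "/debug/heap": 404, "/internal/metrics": 200}

surface = confirm_hallucinated_endpoints(
    FAKE_SERVER, lambda p: FAKE_SERVER.get(p, 404)
)
print(surface)  # → ['/api/v2/export', '/internal/metrics']
```

Injecting the fetcher keeps the filtering logic testable without touching the network.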
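The validator itself isn't shown in the episode. As a toy stand-in for the headless-browser check it describes, the sketch below accepts an XSS report only if the probe payload survives in the rendered page unescaped; HTML-escaped echoes (the classic way an agent "cheats" a substring check) fail. The payload and function are hypothetical; a real validator would drive a browser and confirm the script actually executed.

```python
import html

XSS_PAYLOAD = '<script>alert("probe")</script>'  # hypothetical probe payload


def validate_xss(rendered_page: str, payload: str = XSS_PAYLOAD) -> bool:
    """Second pair of eyes: the payload must appear *unescaped*.

    Escaped output ('&lt;script&gt;...') is inert text, not a finding,
    so a plain substring match on the raw payload rejects it.
    """
    return payload in rendered_page


vulnerable = "<div>" + XSS_PAYLOAD + "</div>"
safe = "<div>" + html.escape(XSS_PAYLOAD) + "</div>"
print(validate_xss(vulnerable), validate_xss(safe))  # → True False
```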
3. Business/Investment Angle
- Accelerated Product Improvement: Public bug bounty programs gave XBOW immediate, real-world production environments to test its offensive product against, drastically shortening the feedback loop and refinement cycle compared to traditional design partnerships.
- ROI Optimization in Bug Bounties: Success in bug bounty requires optimizing Return on Investment (ROI). This involved sophisticated fingerprinting of the attack surface to focus testing efforts only on hosts likely to yield unique, high-value findings, rather than wasting cycles on mirrored or staging environments.
- Shifting Security Paradigm: Waisman draws a parallel between convincing customers 20 years ago that pen testing was necessary and convincing them today that they must prepare for an era where AI tools can perform penetration tests at machine speed.
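The fingerprinting approach above is described only at a high level. As a hedged sketch of one way it could work (the field names and scan data below are invented for illustration), each host is reduced to a fingerprint of its observable stack, and only one representative per fingerprint group is tested, so cycles are not wasted re-testing mirrors or staging clones:

```python
import hashlib


def fingerprint(host_profile: dict) -> str:
    """Collapse a host to a short, stable hash of its observable stack."""
    material = "|".join(
        str(host_profile.get(k, "")) for k in ("server", "title", "framework")
    )
    return hashlib.sha256(material.encode()).hexdigest()[:12]


def dedupe_targets(hosts: dict[str, dict]) -> list[str]:
    """Keep one representative host per fingerprint group."""
    seen, targets = set(), []
    for host, profile in hosts.items():
        fp = fingerprint(profile)
        if fp not in seen:
            seen.add(fp)
            targets.append(host)
    return targets


# Hypothetical scan output: www and staging are clones of each other.
scan = {
    "www.example.com": {"server": "nginx", "title": "Shop", "framework": "rails"},
    "staging.example.com": {"server": "nginx", "title": "Shop", "framework": "rails"},
    "api.example.com": {"server": "envoy", "title": "", "framework": "grpc"},
}
print(dedupe_targets(scan))  # → ['www.example.com', 'api.example.com']
```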
4. Notable Companies/People
- Nico Waisman (XBOW): The guest; his background spans security leadership (Lyft, GitHub) and offensive security (Immunity).
- XBOW: The company developing the LLM-based offensive security product discussed in the episode.
- HackerOne: The platform XBOW used to test its LLM agent against live bug bounty programs, where it achieved top rankings.
- Daniel Stenberg (curl project): Mentioned as a vocal critic highlighting issues with LLM-generated “slop” reports, providing context on the challenges of low-quality submissions.
5. Future Implications
The conversation strongly suggests that LLMs will become integral to offensive security, moving beyond simple fuzzing to complex, multi-step attack chains that mimic human reasoning (gather information, attempt attack, reflect, refine). The industry is heading toward a future where security teams must defend against attacks generated and executed at machine speed, requiring automated validation and defense mechanisms capable of handling AI-driven creativity.
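The gather-attempt-reflect-refine loop described above can be sketched as a small agent skeleton. This is not XBOW's implementation; the `propose`/`execute` callables stand in for the LLM call and the tool runner, and the audit trail mirrors the episode's point that every action, its reasoning, and its output get recorded:

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AttackStep:
    reasoning: str  # why the agent chose this action
    action: str     # the request or command it issued
    output: str     # what the target returned


@dataclass
class AgentLoop:
    """Gather -> attempt -> reflect -> refine, with a full audit trail.

    `propose` asks the model for the next (reasoning, action) given the
    goal, the last observation, and the trace so far; `execute` runs it.
    """
    propose: Callable
    execute: Callable
    trace: list = field(default_factory=list)

    def run(self, goal: str, max_steps: int = 20) -> list:
        observation = ""
        for _ in range(max_steps):
            reasoning, action = self.propose(goal, observation, self.trace)
            if action == "stop":
                break
            observation = self.execute(action)  # reflect on this next turn
            self.trace.append(AttackStep(reasoning, action, observation))
        return self.trace


# Stubbed model and tool runner, just to exercise the loop shape.
steps = iter([("map the app first", "GET /"), ("nothing left to try", "stop")])
agent = AgentLoop(propose=lambda g, o, t: next(steps),
                  execute=lambda a: "200 OK")
trace = agent.run("find an auth bypass")
print(len(trace))  # → 1
```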
6. Target Audience
This episode is highly valuable for Application Security Professionals, Penetration Testers, Security Engineers, and Technology Leaders interested in the practical implementation, strategic advantages, and inherent risks of integrating Generative AI into offensive security workflows.
🏢 Companies Mentioned
XBOW, HackerOne, Lyft, GitHub, Immunity
đź’¬ Key Insights
"Prompt injection is one of those things that is unsolved, I want to assert, because as you mentioned earlier, you know, LLMs are essentially natural language, and natural language can appear everywhere, anywhere."
"And he was like, can you actually create an OpenAPI spec for me after all the findings you got? And I was like, why are you running that? It's because I don't have that data. Like the engineer had not put a good inventory of all our endpoints. And suddenly, I actually can help with that. I was like, I never expected that."
"The good thing about AI penetration testing is that it's all built by a computer. So basically, what we do is we collect everything that we do over the penetration test. We have every network packet that was sent and received by the attacker. We have all the actions that the LLM took, and that includes the reasoning of why the pen tester is performing an action, what action is performed, and what the output of the action is."
"If you run the same agent on one endpoint with certain parameters multiple times, the initial 10-20 tests are going to be very similar, especially as it interacts with the output as it goes. The interesting part happens after the first 20 steps: once it goes a little farther than that and runs out of ideas, it comes up with the most creative ideas, which end up finding new vulnerabilities."
"What we do in terms of uncertainty is try to make the problems smaller. So rather than just telling like, this is an application, go on, find bugs, we try to transform that attack, especially the attack chain, into small unit tests."
"That is exactly what I think the future of penetration testing is going to look like: use LLM penetration testing tools to basically cover all the bases, the annoying part, and then take your human expertise and go deep into finding the more fun bugs, to be quite honest."