- Google’s AI system Big Sleep discovered 20 new vulnerabilities in open-source projects like FFmpeg and ImageMagick.
- The vulnerabilities were found and reproduced entirely by the AI, with human experts only involved in the final review.
- Developed by DeepMind and Project Zero, Big Sleep represents a major step in autonomous vulnerability discovery.
- Experts praise its design but caution that AI-generated bug reports are sometimes false positives, so every finding requires careful vetting.
Google has taken a major leap forward in cybersecurity with the debut of its AI-powered vulnerability detection system, Big Sleep.
The company announced on Monday that the system had uncovered 20 previously unknown security vulnerabilities in popular open-source projects, including the multimedia processing library FFmpeg and the image-processing suite ImageMagick.
The vulnerabilities are still under review and have not yet been publicly disclosed, a standard responsible-disclosure practice that gives developers time to issue patches before users are exposed to risk.
However, Google’s security team emphasized that the findings were legitimate and demonstrate the powerful potential of AI in the hunt for software flaws.
AI Found the Bugs, Not the Humans
Heather Adkins, Google’s Vice President of Security, explained that while a human security expert reviewed each finding before reporting it to the affected software maintainers, the vulnerabilities themselves were entirely discovered and reproduced by Big Sleep.
“This is the first time we’ve had an AI system perform the full discovery and reproduction process independently,” Adkins said. “The human role in this was purely for validation and communication. The heavy lifting was all done by Big Sleep.”
Big Sleep was developed by Google’s artificial intelligence division DeepMind, working in close partnership with Project Zero, the company’s elite vulnerability research team known for finding critical bugs across the web.
Behind the Code
Big Sleep is part of a growing class of AI tools trained to find software bugs autonomously. These systems, powered by large language models (LLMs), combine code analysis with reasoning abilities, allowing them to comb through vast open-source codebases in search of subtle, complex flaws.
Unlike more traditional static analysis tools or rule-based scanners, Big Sleep uses AI to simulate how code behaves in real-world scenarios, giving it a better chance of detecting deep logic issues that other tools may miss.
“This is not just about throwing computing power at the problem,” said Royal Hansen, Google’s Vice President of Engineering. “It’s about enabling machines to reason like experienced security researchers. We see this as a new frontier in automated vulnerability discovery.”
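Google has not published Big Sleep’s internals, so the following is only a minimal sketch of the general pattern described above: reason about the code, propose inputs likely to hit suspicious paths, and keep only findings that come with a working reproducer. Every name here is hypothetical; `parse_chunk` is a planted toy bug, and `propose_inputs` stands in for the LLM’s reasoning step.

```python
"""Illustrative sketch only: Google has not published Big Sleep's
architecture. This toy loop shows the general shape of LLM-guided bug
hunting: propose inputs based on reasoning about the code, run the
target, and keep only findings with a working reproducer."""


def parse_chunk(data: bytes) -> int:
    """Toy 'target': a parser with a planted logic flaw (hypothetical)."""
    if len(data) < 4:
        raise ValueError("truncated header")
    length = int.from_bytes(data[:4], "big")
    # Bug: trusts the declared length without bounds-checking the payload.
    payload = data[4:4 + length]
    return payload[length - 1]  # IndexError when payload is shorter than length


def propose_inputs():
    """Stand-in for the model: a real system would have an LLM reason about
    the code and suggest inputs likely to reach suspicious paths."""
    yield b""                                  # truncated header
    yield (4).to_bytes(4, "big") + b"abcd"     # consistent length: benign
    yield (8).to_bytes(4, "big") + b"ab"       # declared length > payload


def hunt():
    """Run each proposed input; keep crashing inputs as reproducers."""
    findings = []
    for candidate in propose_inputs():
        try:
            parse_chunk(candidate)
        except ValueError:
            continue  # handled error path, not a bug
        except Exception as exc:  # unexpected crash: a reproducible finding
            findings.append((candidate, repr(exc)))
    return findings


if __name__ == "__main__":
    for reproducer, crash in hunt():
        print(f"crash {crash} reproduced by input {reproducer!r}")
```

The design point this sketch borrows from the article is that nothing is reported unless the crash is actually reproduced, which is what separates a finding from a hallucination.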
Not Alone in the Race, But Still Leading
Google isn’t the only player exploring AI-based security tools. Competitors like RunSybil and XBOW have also gained attention. XBOW recently topped a U.S. leaderboard on bug bounty platform HackerOne, reflecting the growing impact of these tools in real-world security research.
Vlad Ionescu, co-founder and CTO of RunSybil, acknowledged Big Sleep’s legitimacy and potential. “The team behind it clearly knows what they’re doing,” he said. “DeepMind provides the raw AI horsepower, and Project Zero brings years of experience in vulnerability research. That’s a powerful combination.”
Promise and Pitfalls of AI in Security
As promising as these tools are, they are not without challenges. Open-source maintainers have raised concerns about a flood of AI-generated bug reports that often turn out to be hallucinations: false positives that waste developers’ time.
“That’s the problem people are running into,” Ionescu said. “We’re getting reports that look like gold but are actually just noise.”
Google says it is aware of these concerns and has designed Big Sleep’s process to minimize low-quality reports. Every finding is vetted by a human expert before submission, helping to maintain the credibility and usefulness of the reports.
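The article confirms two ingredients of that process: automated reproduction by the AI and a final human review. One plausible way to wire them together, sketched below with entirely hypothetical names, is to gate the human review queue on a reproducer that demonstrably triggers the crash it claims.

```python
"""Illustrative sketch only: a reproduction-gated triage queue.
A finding reaches the human reviewer only if its attached reproducer
actually crashes the target. All names here are hypothetical."""
from dataclasses import dataclass
from typing import Callable


@dataclass
class Finding:
    title: str
    reproducer: bytes
    run_target: Callable[[bytes], None]  # runs the affected code on an input

    def reproduces(self) -> bool:
        """True only if the reproducer really crashes the target."""
        try:
            self.run_target(self.reproducer)
        except Exception:
            return True
        return False


def triage(findings: list[Finding]) -> list[Finding]:
    """Forward only reproducible findings to the human reviewer."""
    return [f for f in findings if f.reproduces()]


if __name__ == "__main__":
    def fragile(data: bytes) -> None:
        data[10]  # IndexError on short inputs (planted toy bug)

    real = Finding("OOB read", b"x", fragile)
    noise = Finding("hallucinated", b"0123456789ABCDEF", fragile)
    print([f.title for f in triage([real, noise])])  # ['OOB read']
```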
Despite these growing pains, Big Sleep’s debut shows that AI can do more than assist: it can lead. As systems like Big Sleep mature, they could dramatically improve the speed and scale at which software vulnerabilities are identified and resolved, strengthening digital security for everyone.