For the past year, we’ve covered AI agents as security targets — prompt injection, MCP vulnerabilities, supply chain attacks, browser hijacking. But in early 2026, something shifted. AI agents crossed from being the hunted to being the hunters.
OpenAI’s Codex Security scanned 1.2 million commits in its first 30 days and discovered 14 CVEs across projects like OpenSSH, GnuTLS, Chromium, PHP, and GnuPG. Anthropic’s Claude Opus 4.6 independently found 22 Firefox vulnerabilities. And Cloudflare turned an AI coding agent on its own codebase, uncovering CVE-2026-22813, a CVSS 9.4 remote code execution flaw in markdown rendering.
AI agents aren’t just finding bugs. They’re finding bugs that human security teams missed.
OpenAI Codex Security
Codex Security evolved from OpenAI’s internal tool “Aardvark,” tested in private beta since late 2025. It launched in March 2026 as a research preview for Pro, Enterprise, Business, and Edu customers.
How it works:
- Repository analysis — Ingests your repo and builds an editable threat model based on the code’s architecture
- Vulnerability discovery — Uses frontier-model reasoning plus sandboxed environments to pressure-test suspected issues, sometimes generating proof-of-concept exploits
- Fix proposals — Suggests patches aligned with the codebase’s actual behavior
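The three stages above can be sketched as one pipeline. To be clear, this is a hypothetical illustration, not Codex Security’s actual implementation: the function names, the keyword-based threat model, and the stubbed-out sandbox validation are all assumptions standing in for frontier-model reasoning.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    line: int
    description: str
    severity: str
    confirmed: bool = False

def build_threat_model(repo_files: dict[str, str]) -> list[str]:
    """Stage 1 (hypothetical): pick out files near trust boundaries by path keywords."""
    return [
        path for path in repo_files
        if any(k in path for k in ("auth", "parse", "net", "crypto"))
    ]

def discover(repo_files: dict[str, str], boundaries: list[str]) -> list[Finding]:
    """Stage 2 (hypothetical): flag suspicious constructs inside those boundaries."""
    findings = []
    for path in boundaries:
        for n, line in enumerate(repo_files[path].splitlines(), 1):
            if "eval(" in line or "strcpy(" in line:
                findings.append(Finding(path, n, f"risky call: {line.strip()}", "high"))
    return findings

def validate(findings: list[Finding]) -> list[Finding]:
    """Stage 3 placeholder: a real system would reproduce each issue in a sandbox."""
    for f in findings:
        f.confirmed = True  # stand-in for generating and running a proof-of-concept
    return [f for f in findings if f.confirmed]
```

The point of the structure, not the toy heuristics: discovery is scoped by the threat model rather than run blindly over every file, and nothing is reported until the validation stage confirms it, which is where the false-positive reduction comes from.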
The numbers in 30 days:
- 1.2 million commits scanned
- 792 critical issues found
- 10,561 high-severity issues found
- 14 CVEs filed across major open-source projects
- 50%+ reduction in false positives vs. traditional SAST tools
- 84% reduction in redundant/noisy alerts on some repos
The affected projects include some of the most audited software on earth: OpenSSH, GnuTLS, Chromium, PHP, GnuPG, libssh, Gogs, Thorium. These are codebases with decades of human security review. An AI agent found vulnerabilities that survived all of it.
Claude Opus 4.6 vs. Firefox
Anthropic took a different approach. Rather than building a dedicated security product, they pointed their most capable model — Claude Opus 4.6 — at Firefox’s source code as a security analysis agent. It found 22 vulnerabilities.
The significance isn’t just the count. Firefox is a mature, security-hardened browser with a dedicated security team, fuzzing infrastructure, and a history of extensive third-party audits. Finding 22 issues that existing processes missed suggests that AI agents can see patterns that traditional tools and human reviewers don’t.
Cloudflare’s Self-Analysis
Perhaps the most interesting case: Cloudflare used an AI coding agent to analyze its own code. The agent discovered CVE-2026-22813 — a critical (CVSS 9.4) remote code execution vulnerability in a markdown rendering pipeline.
This matters because markdown rendering is ubiquitous in AI agent systems. Every chat interface, every documentation tool, every agent that processes markdown content could be affected by similar flaws. An AI agent found a vulnerability in the exact type of component that AI agents depend on.
Why AI Agents Are Good at This
Security vulnerability research has characteristics that play to AI strengths:
Pattern recognition at scale. An AI agent can analyze millions of lines of code and recognize subtle patterns — a missing bounds check, an unsafe type conversion, a race condition — across the entire codebase simultaneously. Human researchers focus on specific areas; agents scan everything.
Cross-reference capability. Codex Security doesn’t just look at the code in front of it. It builds a threat model of the architecture, then validates suspected vulnerabilities by reasoning about how different components interact. This is the kind of holistic analysis that’s expensive and slow for human teams.
No fatigue. Security audits are mentally exhausting work. Attention fades over time, especially when reviewing large codebases. An AI agent’s analysis quality doesn’t degrade at hour 40 the way a human’s does.
Novel perspectives. Human security researchers develop expertise through experience, which also creates blind spots. An AI agent trained on the full corpus of security research approaches code without the assumption that “this pattern is probably fine because I’ve seen it a thousand times.”
The Irony
We’ve spent months documenting how AI agents are vulnerable — to prompt injection, to MCP attacks, to supply chain compromises, to browser hijacking. Now those same agents are finding vulnerabilities in the systems they interact with.
This creates an interesting feedback loop:
- AI agents operate in environments (browsers, servers, codebases) with vulnerabilities
- AI agents can now find those vulnerabilities better than previous methods
- Finding and fixing those vulnerabilities makes the environments safer for AI agents
- Which lets AI agents take on more security-sensitive tasks
- Which expands both their attack surface and their defensive capability
What This Means for OpenClaw Users
OpenClaw is model-agnostic, which means you can leverage these security capabilities today:
Run security scans with powerful models. Point Claude or GPT-5.4 at your code via OpenClaw’s coding-agent skills. The same models finding CVEs in OpenSSH can review your projects.
Automate security audits on a schedule. Use OpenClaw’s cron system to run periodic security scans. Set up a nightly agent that reviews recent commits and flags issues.
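A nightly commit-review agent can be sketched in a few lines. OpenClaw’s actual cron and skill APIs aren’t shown here; this uses plain `git` via the standard library, and `build_review_prompt` is a hypothetical prompt the scheduled agent might send, not OpenClaw’s real wording.

```python
import subprocess

def recent_commits(since: str = "1 day ago") -> list[str]:
    """List commit hashes from the review window using plain git."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--pretty=format:%H"],
        capture_output=True, text=True, check=True,
    )
    return [h for h in out.stdout.splitlines() if h]

def build_review_prompt(commit: str) -> str:
    """Hypothetical per-commit prompt for the nightly security-review agent."""
    return (
        f"Review the diff of commit {commit} for security issues: "
        "injection, missing bounds checks, unsafe deserialization. "
        "Reply with a severity and a one-line rationale per finding."
    )
```

Wire `recent_commits()` into whatever scheduler you use, feed each prompt to your model of choice, and route anything flagged high-severity to a human channel rather than auto-merging fixes.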
Layer multiple models. Use one model for initial scanning and another for validation. OpenClaw’s routing makes this trivial — different tasks to different models based on their strengths.
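The scan-then-validate split can be expressed as a small two-pass function. The `scan_model` and `validate_model` callables are hypothetical stand-ins for real model calls made through OpenClaw’s routing; the stubs below exist only so the shape of the pattern is runnable.

```python
from typing import Callable

def layered_scan(
    files: dict[str, str],
    scan_model: Callable[[str], list[str]],
    validate_model: Callable[[str], bool],
) -> dict[str, list[str]]:
    """Two-pass audit: a fast model proposes findings, a stronger one confirms them."""
    confirmed: dict[str, list[str]] = {}
    for path, source in files.items():
        candidates = scan_model(source)  # cheap, high-recall first pass
        kept = [c for c in candidates if validate_model(f"{path}: {c}")]
        if kept:
            confirmed[path] = kept
    return confirmed

# Stub "models" for illustration only; real calls would go to actual LLMs.
def cheap(src: str) -> list[str]:
    return ["possible eval injection"] if "eval(" in src else []

def strong(claim: str) -> bool:
    return "eval" in claim
```

The economics mirror the Codex Security numbers above: the cheap pass can afford to over-report, because the expensive validation pass is what decides which findings ever reach a human.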
Self-audit your OpenClaw setup. The clawdbot-self-security-audit skill can check your configuration for common misconfigurations. Add a frontier model review on top for deeper analysis.
The Landscape Shift
The security industry is adjusting to a world where AI agents are simultaneously:
- The attack surface (MCP vulnerabilities, prompt injection, agentic browser hijacking)
- The attack tool (AI-generated phishing, automated exploitation)
- The defense tool (vulnerability discovery, code review, threat modeling)
- The security researcher (finding novel CVEs in production software)
This isn’t a contradiction — it’s the same dynamic we’ve seen with every powerful technology. The question isn’t whether AI agents will be used for offense and defense. It’s whether the defensive application outpaces the offensive one.
Based on early results (14 CVEs in 30 days from Codex Security alone), the hunters are off to a strong start.
Related: MCP Security Crisis, GlicJack: Chrome Gemini Hijack, How to Set Up Guardrails for Your OpenClaw Agent.