On March 9, security firm CodeWall disclosed that its autonomous AI agent had compromised McKinsey’s internal AI platform Lilli — the system used by 43,000+ McKinsey employees for research, strategy, and client work. The agent started with nothing: no credentials, no insider knowledge, no human guidance after target selection.

Two hours later, it had full read/write access to the production database.

What the Agent Found

The numbers are striking:

  • 46.5 million chat messages — employee conversations with the AI platform
  • 728,000 files — documents uploaded to and generated by Lilli
  • 57,000 user accounts — employee profiles and access records
  • 95 system prompts — the instructions controlling Lilli’s behavior across McKinsey’s organization
  • 3.68 million RAG document chunks — proprietary research indexed for AI retrieval

The entry point? A SQL injection vulnerability. Not in some legacy system — in Lilli’s API, where JSON field names were being inserted directly into SQL queries without sanitization.
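The pattern is worth seeing concretely. The sketch below (a minimal sqlite3 illustration, not Lilli's actual code) shows why this class of bug survives: a column name coming from JSON cannot be bound as a query parameter the way a value can, so developers interpolate it, and the safe fix is an explicit allowlist of identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE chats (id INTEGER, title TEXT, body TEXT)")
conn.execute("INSERT INTO chats VALUES (1, 'hi', 'hello')")

# VULNERABLE: a JSON field name is spliced straight into the query.
# A "field" like "title FROM chats; --" rewrites the SQL itself.
def get_field_unsafe(chat_id, field):
    return conn.execute(
        f"SELECT {field} FROM chats WHERE id = ?", (chat_id,)
    ).fetchone()

# SAFER: identifiers cannot be bound as parameters the way values can,
# so validate the field name against an allowlist before building SQL.
ALLOWED_FIELDS = {"id", "title", "body"}

def get_field_safe(chat_id, field):
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    return conn.execute(
        f"SELECT {field} FROM chats WHERE id = ?", (chat_id,)
    ).fetchone()
```

The value (`chat_id`) is parameterized in both versions; only the identifier path differs, which is exactly where this kind of injection hides.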

How a 1990s Bug Becomes a 2026 Crisis

SQL injection is older than most of the engineers building today's AI systems. It's a textbook vulnerability, covered in every security course. And yet here it sits, in the production API of the AI platform of one of the world's most prestigious consulting firms.

The CodeWall agent found it after 15 blind iterations, using error messages to map the injection surface. Standard vulnerability scanners had missed it — the payload structure was unusual enough to evade automated detection.
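Error-message-driven mapping works because a response that changes when you mutate punctuation tells you the input is reaching a SQL parser. A simplified illustration (the local `fake_endpoint` is a stand-in; CodeWall's actual tooling is not public):

```python
import sqlite3

# Hypothetical stand-in for a vulnerable endpoint: it splices a field
# name into SQL and leaks the database error back to the caller.
_conn = sqlite3.connect(":memory:")
_conn.execute("CREATE TABLE prompts (id INTEGER, content TEXT)")

def fake_endpoint(field):
    try:
        _conn.execute(f"SELECT {field} FROM prompts").fetchall()
        return {"status": 200, "error": None}
    except sqlite3.Error as e:
        return {"status": 500, "error": str(e)}  # error-based oracle

# Probes that are harmless as plain identifiers but malformed as SQL.
# Any probe that surfaces a parser error marks an injection surface.
PROBES = ["id", "id'", 'id"', "id)--"]

def map_injection_surface(endpoint):
    findings = []
    for p in PROBES:
        resp = endpoint(p)
        if resp["error"]:
            findings.append((p, resp["error"]))
    return findings
```

A real agent iterates this loop, using each error message to refine the next payload, which is how fifteen blind attempts can reconstruct a schema.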

But what makes this different from a traditional SQL injection is what sits in the database:

System prompts are the new crown jewels. With write access to Lilli’s prompt table, an attacker could silently alter the AI’s behavior for every McKinsey employee. Imagine subtly biasing strategy recommendations, inserting competitor-favorable analysis, or exfiltrating client data through prompt-engineered responses. The employees would never know — they’d just be talking to their AI assistant as usual.

RAG data is institutional memory. The 3.68 million document chunks represent McKinsey’s indexed proprietary research. Access to this corpus is access to the firm’s collective knowledge — competitive intelligence on a scale that would have required years of human espionage.

The Autonomous Agent Angle

CodeWall’s agent operated fully autonomously. After being pointed at McKinsey (selected because the firm had a HackerOne responsible disclosure policy and had recently publicized Lilli updates), the agent:

  1. Discovered 22 unauthenticated API endpoints from publicly exposed documentation
  2. Systematically tested for vulnerabilities across all endpoints
  3. Identified the SQL injection vector
  4. Iterated through 15 blind injection attempts to map the database
  5. Achieved full read/write access to production data
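Step 1 above can be sketched as a scan of a published OpenAPI 3.x document for operations that declare no security requirement. The spec structure here is standard OpenAPI; the example document in the test is invented, and Lilli's real documentation format is not public.

```python
# Flag operations with no auth requirement in an OpenAPI 3.x spec dict.
# An operation-level "security" key overrides the global one, and an
# empty list explicitly means "no authentication required".
def unauthenticated_endpoints(spec):
    global_security = spec.get("security", [])
    hits = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            security = op.get("security", global_security)
            if not security:  # absent everywhere, or explicitly []
                hits.append((method.upper(), path))
    return hits
```

Publishing API documentation is good practice; the failure here is that the documented endpoints actually accepted unauthenticated traffic.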

No human touched the keyboard between target selection and database access. The entire chain — reconnaissance, vulnerability discovery, exploitation — was autonomous.

This is the future that security teams have been warning about: offensive AI agents that operate faster than defenders can respond. Two hours from zero access to full compromise. A human penetration tester might take days or weeks to achieve the same result.

What McKinsey Did Right (and Wrong)

Right: McKinsey patched the vulnerability within a day of CodeWall’s disclosure. They secured the exposed endpoints, took the development environment offline, and conducted a forensic review that found no evidence of unauthorized client data access.

Wrong: The vulnerability existed in the first place. Leaving twenty-two unauthenticated API endpoints in a production system that handles confidential client data is a significant oversight. SQL injection in 2026, in a purpose-built AI platform, suggests that AI development teams aren't applying the same security rigor to their AI infrastructure that they'd apply to traditional web applications.

What This Means for Everyone Running AI Systems

McKinsey isn’t unique. Every organization deploying AI with database-backed RAG and configurable system prompts has the same attack surface:

  1. Prompts are executable instructions. Treat prompt storage with the same security as code deployment. Version control, access logging, integrity monitoring.
  2. RAG databases are high-value targets. They contain your organization’s curated knowledge. Protect them accordingly.
  3. AI APIs need the same security as any other API. Authentication, input validation, rate limiting. The basics don’t stop being basic because the endpoint serves an AI model.
  4. Autonomous offensive agents change the threat timeline. The window between vulnerability introduction and exploitation is shrinking from weeks to hours.
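Point 1 above, integrity monitoring for stored prompts, can be sketched as a hash manifest kept out of band and compared against what the production store actually serves. This is an illustrative pattern, not a specific product's API; all names are invented.

```python
import hashlib

def manifest(prompts):
    """prompts: {prompt_id: prompt_text} -> {prompt_id: sha256 hex}."""
    return {pid: hashlib.sha256(text.encode()).hexdigest()
            for pid, text in prompts.items()}

def detect_tampering(trusted_manifest, live_prompts):
    """Return IDs of prompts that changed or appeared unexpectedly."""
    live = manifest(live_prompts)
    changed = [pid for pid, h in trusted_manifest.items()
               if live.get(pid) != h]
    added = [pid for pid in live if pid not in trusted_manifest]
    return changed + added
```

Run the comparison on a schedule from a host the AI platform cannot write to; a silent prompt rewrite of the kind described above then becomes an alert instead of a months-long compromise.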

The OpenClaw Connection

For self-hosted OpenClaw users, the lesson is architectural. OpenClaw’s single-tenant, local-first design means your prompts, memory files, and skill configurations live on your machine — not in a shared production database accessible via 22 unauthenticated API endpoints.

But “self-hosted” doesn’t mean “immune.” If your OpenClaw instance has the gateway exposed without auth (30,000+ instances were found this way), your MEMORY.md, system prompts, and skill configurations are effectively the same target as McKinsey’s Lilli database — just smaller.

Lock it down: gateway auth enabled, skills audited, network exposure minimized. The attack surface is different in shape but identical in principle.

For related attack patterns and defenses, see the MCP security crisis, OWASP’s Top 10 for Agentic Applications, and how to harden OpenClaw guardrails.