On March 5, Amazon’s North American marketplaces experienced a 99% drop in orders. Six-point-three million orders lost. Not from a DDoS attack. Not from a cloud region failure. From a code deployment that bypassed the company’s own safety checks.
Three days earlier, another incident caused 120,000 lost orders and 1.6 million website errors. That one was directly tied to Amazon Q, the company’s AI coding assistant.
Amazon SVP Dave Treadwell told staff in an internal meeting that a “trend of incidents” had emerged since Q3 2025, with “several major” disruptions in recent weeks. The company is now rolling out a 90-day safety reset — the clearest admission yet that AI-assisted coding is outpacing the review processes designed to catch errors.
What Actually Happened
Business Insider obtained internal documents detailing two major incidents:
March 2 — The AI incident
Customers across Amazon marketplaces saw incorrect delivery times when adding items to their carts. Root cause: a code change where Amazon Q was “one of the primary contributors.” The internal review was blunt:
“GenAI’s usage in control plane operations will accelerate exposure of sharp edges and places where guardrails do not exist. We need investment in control plane safety.”
March 5 — The catastrophic outage
A production configuration change was deployed without using Amazon’s formal documentation and approval process (Modeled Change Management). A single authorized operator executed a high-blast-radius config change with no guardrails. Orders dropped to near zero across North America for hours.
“No automated pre-deployment validation. Single authorized operator could execute a high-blast-radius config change with no guardrails.”
Amazon clarified that only one incident was AI-related, and none involved fully AI-written code. But the pattern is clear: AI tools accelerate code production, and existing review processes weren’t designed for that velocity.
The 90-Day Reset
Treadwell’s response targets approximately 335 “Tier-1 systems” — services that can directly impact consumers and have experienced multiple order-impacting incidents since last year. The new rules:
Mandatory two-person review. Every code change requires review by two people before deployment. This was already policy but was being bypassed, especially for AI-assisted changes that felt “quick.”
Formal change documentation. Engineers must use Amazon’s internal approval tool for all changes. No more deploying without a paper trail.
Automated reliability checks. New systems that strictly enforce Amazon’s central reliability engineering rules before code reaches production.
Leadership audits. All Tier-1 system owners, plus Director and VP-level leaders, must audit all production code change activities within their organizations.
Treadwell framed it as “controlled friction” — deliberately slowing things down to catch problems before they reach customers.
The Speed-Safety Paradox
This is the defining tension of AI-assisted coding in 2026.
Treadwell had previously mandated that developers use AI coding tools (including Kiro, Amazon’s more recent offering) for at least 80% of their weekly coding tasks. The goal was productivity. Engineers could write more code, faster.
The problem: review processes are human-bottlenecked. When AI tools generate code at 3x the previous rate, the review pipeline either backs up (killing the productivity gains) or gets rushed (creating the exact failures Amazon experienced).
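The arithmetic behind that bottleneck is easy to sketch. With invented numbers (a fixed review capacity of 20 changes per day, chosen purely for illustration), a toy model shows the review backlog staying flat at pre-AI output and growing without bound at 3x:

```python
def review_backlog(days: int, produced_per_day: int,
                   review_capacity_per_day: int) -> list[int]:
    """Track the unreviewed-change backlog at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        backlog += produced_per_day                     # new changes arrive
        backlog -= min(backlog, review_capacity_per_day)  # reviewers drain the queue
        history.append(backlog)
    return history


# Illustrative numbers only, not Amazon's figures.
review_backlog(5, produced_per_day=18, review_capacity_per_day=20)  # → [0, 0, 0, 0, 0]
review_backlog(5, produced_per_day=54, review_capacity_per_day=20)  # → [34, 68, 102, 136, 170]
```

Once production exceeds review capacity, the backlog grows linearly forever; the only options are adding reviewers, rushing reviews, or automating part of the check.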
Treadwell acknowledged this directly, writing that the new guardrails will combine “agentic” tools (AI-driven checks) with “deterministic” systems (rules-based, predictable checks). The implication: you need AI to review AI-generated code, because humans can’t keep up.
But there’s a recursion problem. If AI coding tools cause incidents, and you deploy AI review tools to catch them, who reviews the AI reviewers?
The Broader Pattern
Amazon isn’t alone. The entire industry is navigating this:
- GPT-5.4 launched with native computer use capabilities, enabling AI to write and deploy code with minimal human intervention
- Google’s Workspace CLI includes MCP server integration, creating more surfaces where AI-generated changes touch production
- Every major tech company has mandated AI coding tool adoption, creating the same velocity-versus-safety tension
The AWS cloud business wasn’t involved in these specific incidents, but the precedent matters. If AI-assisted code can take down Amazon.com, it can take down the infrastructure that runs a significant chunk of the internet.
What This Means for AI Agent Deployments
If you’re running AI agents that write or modify code — whether through OpenClaw, Codex, Claude Code, or any other tool — Amazon’s experience is instructive:
Review gates scale linearly; AI output scales exponentially. The bottleneck isn’t generating code. It’s validating it. Any system that relies on human review will eventually be overwhelmed.
“Quick” changes are the most dangerous. The March 2 incident involved AI-assisted changes that felt routine enough to skip full review. The March 5 incident involved a single operator making a “simple” config change. Both caused catastrophic failures.
Blast radius matters more than bug rate. AI-generated code might not have a higher bug rate than human code. But AI can generate changes that propagate broadly across systems in ways that weren’t anticipated — what Treadwell called “high blast radius changes.”
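One way to make “blast radius” concrete is a transitive walk over a service dependency graph: count everything downstream of the changed service before deciding how much review the change needs. The graph and threshold below are invented for illustration:

```python
# Hypothetical dependency graph: service -> services that depend on it.
DEPENDENTS = {
    "pricing": ["cart", "checkout"],
    "cart": ["checkout"],
    "checkout": [],
}


def blast_radius(service: str, graph: dict) -> set:
    """Transitively collect every service a change to `service` could affect."""
    seen, stack = set(), [service]
    while stack:
        current = stack.pop()
        for dependent in graph.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen


def requires_extra_review(service: str, graph: dict, threshold: int = 1) -> bool:
    """Flag changes whose downstream impact exceeds the threshold."""
    return len(blast_radius(service, graph)) > threshold
```

Under this model, a “simple” change to pricing is flagged because two services sit downstream of it, while a leaf service like checkout is not. The point is that the trigger for extra scrutiny is reach, not the perceived difficulty of the change.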
You need deterministic guardrails, not just AI guardrails. Treadwell’s “controlled friction” approach — mandatory documentation, two-person review, automated reliability checks — is fundamentally about adding deterministic constraints to a non-deterministic process.
For OpenClaw users who give agents tool access: this is the same principle as sandboxing, permission boundaries, and egress filtering. The more capable your agent, the more important the constraints around it.
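As a sketch of that principle, a deny-by-default permission boundary can be expressed as a small allowlist check. The action names and paths here are hypothetical, not the API of OpenClaw or any real agent framework:

```python
# Hypothetical policy: which actions an agent may take, on which path prefixes.
ALLOWED_ACTIONS = {
    "read_file": ("/workspace",),           # prefixes the agent may read from
    "write_file": ("/workspace/output",),   # narrower prefix for writes
}


def is_permitted(action: str, path: str) -> bool:
    """Deny by default; allow only listed actions on allowed path prefixes."""
    prefixes = ALLOWED_ACTIONS.get(action)
    if prefixes is None:
        return False  # unknown action -> denied outright
    return any(path.startswith(prefix) for prefix in prefixes)
```

The design choice mirrors Treadwell’s “deterministic” guardrails: the check is a fixed rule the agent cannot negotiate with, so a more capable agent gets no more authority than the policy grants.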
Amazon learned this lesson at the cost of 6.3 million orders. The rest of the industry is watching.
For more on safe agent execution, read Amazon’s 90-day code safety reset memo, GPT-5.4 computer use and OpenClaw, and the OpenClaw guardrails guide.
Sources: Business Insider, TechHQ