On March 5, Amazon’s North American marketplaces experienced a 99% drop in orders. Six-point-three million orders lost. Not from a DDoS attack. Not from a cloud region failure. From a code deployment that bypassed the company’s own safety checks.
Three days earlier, another incident caused 120,000 lost orders and 1.6 million website errors. That one was directly tied to Amazon Q, the company’s AI coding assistant.
Amazon SVP Dave Treadwell told staff in an internal meeting that a “trend of incidents” had emerged since Q3 2025, with “several major” disruptions in recent weeks. The company is now rolling out a 90-day safety reset — the clearest admission yet that AI-assisted coding is outpacing the review processes designed to catch errors.
What Actually Happened
Business Insider obtained internal documents detailing two major incidents:
March 2 — The AI incident
Customers across Amazon marketplaces saw incorrect delivery times when adding items to their carts. Root cause: a code change where Amazon Q was “one of the primary contributors.” The internal review was blunt:
“GenAI’s usage in control plane operations will accelerate exposure of sharp edges and places where guardrails do not exist. We need investment in control plane safety.”
March 5 — The catastrophic outage
A production configuration change was deployed without using Amazon’s formal documentation and approval process (Modeled Change Management). A single authorized operator executed a high-blast-radius config change with no guardrails. Orders dropped to near zero across North America for hours.
“No automated pre-deployment validation. Single authorized operator could execute a high-blast-radius config change with no guardrails.”
Amazon clarified that only one incident was AI-related, and none involved fully AI-written code. But the pattern is clear: AI tools accelerate code production, and existing review processes weren’t designed for that velocity.
The 90-Day Reset
Treadwell’s response targets approximately 335 “Tier-1 systems” — services that can directly impact consumers and have experienced multiple order-impacting incidents since last year. The new rules:
Mandatory two-person review. Every code change requires review by two people before deployment. This was already policy but was being bypassed, especially for AI-assisted changes that felt “quick.”
Formal change documentation. Engineers must use Amazon’s internal approval tool for all changes. No more deploying without a paper trail.
Automated reliability checks. New systems that strictly enforce Amazon’s central reliability engineering rules before code reaches production.
Leadership audits. All Tier-1 system owners, plus Director and VP-level leaders, must audit all production code change activities within their organizations.
Treadwell framed it as “controlled friction” — deliberately slowing things down to catch problems before they reach customers.
The Speed-Safety Paradox
This is the defining tension of AI-assisted coding in 2026.
Treadwell had previously mandated that developers use AI coding tools (including Kiro, Amazon’s more recent offering) for at least 80% of their weekly coding tasks. The goal was productivity. Engineers could write more code, faster.
The problem: review processes are human-bottlenecked. When AI tools generate code at 3x the previous rate, the review pipeline either backs up (killing the productivity gains) or gets rushed (creating the exact failures Amazon experienced).
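The arithmetic behind that bottleneck is easy to sketch. With invented numbers (a fixed review capacity of 20 changes per day, chosen purely for illustration), a toy model shows the review backlog staying flat at pre-AI output and growing without bound at 3x:

```python
def review_backlog(days: int, produced_per_day: int,
                   review_capacity_per_day: int) -> list[int]:
    """Track the unreviewed-change backlog at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        backlog += produced_per_day                     # new changes arrive
        backlog -= min(backlog, review_capacity_per_day)  # reviewers drain the queue
        history.append(backlog)
    return history


# Illustrative numbers only, not Amazon's figures.
review_backlog(5, produced_per_day=18, review_capacity_per_day=20)  # → [0, 0, 0, 0, 0]
review_backlog(5, produced_per_day=54, review_capacity_per_day=20)  # → [34, 68, 102, 136, 170]
```

Once production exceeds review capacity, the backlog grows linearly forever; the only options are adding reviewers, rushing reviews, or automating part of the check.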
Treadwell acknowledged this directly, writing that the new guardrails will combine “agentic” tools (AI-driven checks) with “deterministic” systems (rules-based, predictable checks). The implication: you need AI to review AI-generated code, because humans can’t keep up.
But there’s a recursion problem. If AI coding tools cause incidents, and you deploy AI review tools to catch them, who reviews the AI reviewers?
The Broader Pattern
Amazon isn’t alone. The entire industry is navigating this:
- GPT-5.4 launched with native computer use capabilities, enabling AI to write and deploy code with minimal human intervention
- Google’s Workspace CLI includes MCP server integration, creating more surfaces where AI-generated changes touch production
- Every major tech company has mandated AI coding tool adoption, creating the same velocity-versus-safety tension
The AWS cloud business wasn’t involved in these specific incidents, but the precedent matters. If AI-assisted code can take down Amazon.com, it can take down the infrastructure that runs a significant chunk of the internet.
What This Means for AI Agent Deployments
If you’re running AI agents that write or modify code — whether through OpenClaw, Codex, Claude Code, or any other tool — Amazon’s experience is instructive:
Review gates scale linearly; AI output scales exponentially. The bottleneck isn’t generating code. It’s validating it. Any system that relies on human review will eventually be overwhelmed.
“Quick” changes are the most dangerous. The March 2 incident involved AI-assisted changes that felt routine enough to skip full review. The March 5 incident involved a single operator making a “simple” config change. Both caused catastrophic failures.
Blast radius matters more than bug rate. AI-generated code might not have a higher bug rate than human code. But AI can generate changes that propagate broadly across systems in ways that weren’t anticipated — what Treadwell called “high blast radius changes.”
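One way to make “blast radius” concrete is a transitive walk over a service dependency graph: count everything downstream of the changed service before deciding how much review the change needs. The graph and threshold below are invented for illustration:

```python
# Hypothetical dependency graph: service -> services that depend on it.
DEPENDENTS = {
    "pricing": ["cart", "checkout"],
    "cart": ["checkout"],
    "checkout": [],
}


def blast_radius(service: str, graph: dict) -> set:
    """Transitively collect every service a change to `service` could affect."""
    seen, stack = set(), [service]
    while stack:
        current = stack.pop()
        for dependent in graph.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return seen


def requires_extra_review(service: str, graph: dict, threshold: int = 1) -> bool:
    """Flag changes whose downstream impact exceeds the threshold."""
    return len(blast_radius(service, graph)) > threshold
```

Under this model, a “simple” change to pricing is flagged because two services sit downstream of it, while a leaf service like checkout is not. The point is that the trigger for extra scrutiny is reach, not the perceived difficulty of the change.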
You need deterministic guardrails, not just AI guardrails. Treadwell’s “controlled friction” approach — mandatory documentation, two-person review, automated reliability checks — is fundamentally about adding deterministic constraints to a non-deterministic process.
For OpenClaw users who give agents tool access: this is the same principle as sandboxing, permission boundaries, and egress filtering. The more capable your agent, the more important the constraints around it.
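As a sketch of that principle, a deny-by-default permission boundary can be expressed as a small allowlist check. The action names and paths here are hypothetical, not the API of OpenClaw or any real agent framework:

```python
# Hypothetical policy: which actions an agent may take, on which path prefixes.
ALLOWED_ACTIONS = {
    "read_file": ("/workspace",),           # prefixes the agent may read from
    "write_file": ("/workspace/output",),   # narrower prefix for writes
}


def is_permitted(action: str, path: str) -> bool:
    """Deny by default; allow only listed actions on allowed path prefixes."""
    prefixes = ALLOWED_ACTIONS.get(action)
    if prefixes is None:
        return False  # unknown action -> denied outright
    return any(path.startswith(prefix) for prefix in prefixes)
```

The design choice mirrors Treadwell’s “deterministic” guardrails: the check is a fixed rule the agent cannot negotiate with, so a more capable agent gets no more authority than the policy grants.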
Amazon learned this lesson at the cost of 6.3 million orders. The rest of the industry is watching.
For more on safe agent execution, read Amazon’s 90-day code safety reset memo, GPT-5.4 computer use and OpenClaw, and the OpenClaw guardrails guide.
Sources: Business Insider, TechHQ