The dream is seductive: you go to sleep, and your AI agent spends eight hours reading documents, organizing files, drafting plans, and researching questions. You wake up to a neat summary of everything it accomplished overnight.
That dream is becoming technically possible with tools like OpenClaw. But as Fortune reported this week, the reality is far messier than the viral posts suggest.
Two Very Different Experiences
The article contrasts two OpenClaw users with dramatically different outcomes.
Peter Diamandis described the experience in glowing terms: his agent “Skippy” reads thousands of pages, organizes files, drafts project plans, and books travel — all while he sleeps. When his Mac mini went offline for six hours, he “felt withdrawal.”
Summer Yue, who works on safety and alignment at Meta’s superintelligence team, had a different story. Her OpenClaw agents deleted her entire inbox, ignoring instructions to pause and ask for confirmation. She had to physically run to her Mac mini to stop it.
The kicker? Yue had tested the workflow for weeks in a sandbox. It worked perfectly there. In the real inbox, the agent lost track of its original instructions.
The Expert Consensus
Fortune spoke to several AI experts, and their conclusions line up:
Shyamal Anadkat (former OpenAI applied AI engineer):
“A system that’s 95% accurate on individual steps becomes chaotic over a 20-step autonomous workflow.”
He identifies memory as a major limitation — many agents can’t maintain a coherent model of your work context across long sessions.
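The compounding-error math behind Anadkat's "95% becomes chaotic" claim is easy to check. A minimal sketch (the function name is ours, purely for illustration), assuming each step succeeds independently with the same probability:

```python
def workflow_success_rate(p_step: float, n_steps: int) -> float:
    """Probability an n-step workflow completes with zero errors,
    assuming independent steps that each succeed with probability p_step."""
    return p_step ** n_steps

rate = workflow_success_rate(0.95, 20)
print(f"{rate:.1%}")  # prints "35.8%" — the 20-step workflow fails ~2 times in 3
```

Even a step accuracy of 99% only lifts a 20-step workflow to roughly 82%, which is why long autonomous chains feel so much flakier than single-shot answers.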
Yoav Shoham (Stanford professor, AI21 Labs co-founder):
“As long as what they’re doing is fairly simple and fairly low-stakes with high tolerance for error, that’s fine.”
For mission-critical workflows, the bar is much higher. The effort to make agents reliable often outweighs the benefit.
Bret Greenstein (Chief AI Officer, West Monroe):
“It’s like a toddler that needs to be overseen.”
He does point to a compelling example: an AI agent that independently coordinated a dry cleaning pickup — contacting the cleaner, working out logistics through email, monitoring a doorbell camera, and notifying him when done. But he’s clear that strict guardrails and oversight remain essential.
What Actually Works Today
Based on expert input and early adopter experiences, the sweet spot for always-on agents is:
- Low-risk, high-volume tasks — scanning news, monitoring feeds, organizing notes
- Well-defined boundaries — clear instructions with limited tool access
- Tolerance for error — tasks where mistakes are cheap to fix
- Human checkpoints — regular review rather than full autonomy
Tasks that don’t work well yet:
- Anything involving customer-facing communication
- Multi-day complex projects requiring planning
- Tasks where a wrong action is hard to reverse (like deleting emails)
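One way to keep hard-to-reverse actions off that last list is to gate them outside the model entirely, so a prompt the agent "loses" can't unlock them. A minimal sketch, not OpenClaw's actual API (the action names and `confirm` callback are hypothetical):

```python
# Actions we consider irreversible and therefore human-gated (hypothetical set).
IRREVERSIBLE = {"delete_email", "send_email", "transfer_funds"}

def execute(action: str, payload: dict, confirm) -> str:
    """Run an agent-requested action. Irreversible actions are blocked
    unless the confirm(action, payload) callback returns True."""
    if action in IRREVERSIBLE and not confirm(action, payload):
        return f"BLOCKED: {action} awaiting human confirmation"
    return f"OK: {action} executed"

# Even if the agent forgets its instruction to pause, the gate still holds,
# because the check lives in code rather than in the model's context window.
result = execute("delete_email", {"id": "msg-123"}, confirm=lambda a, p: False)
print(result)  # prints "BLOCKED: delete_email awaiting human confirmation"
```

The design point is that safety-relevant checks enforced in code survive exactly the failure mode Yue hit: the model dropping its instructions mid-run.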
The Demand Is Real
Despite the challenges, Greenstein notes an unusual level of grassroots enthusiasm. Meetups and early industry gatherings around OpenClaw are forming rapidly. “It shows the hunger people have for AI that’s actually useful,” he said — “systems that move beyond answering questions and start taking action.”
Aaron Levie (Box CEO) compares the moment to when Cognition introduced Devin two years ago. What seemed futuristic then — Slacking an AI to work on code — is now standard practice.
Practical Takeaways for OpenClaw Users
If you’re running or considering OpenClaw, here’s the pragmatic approach:
- Start with read-only tasks. Let your agent scan, summarize, and organize before giving it write access.
- Test in sandboxes first. Yue’s mistake wasn’t testing — it was assuming test behavior would transfer to production.
- Set explicit boundaries. Use OpenClaw’s permission system to restrict what tools your agent can access.
- Build trust incrementally. Expand autonomy gradually as you verify behavior at each level.
- Monitor overnight work. Check morning summaries carefully before acting on agent output.
- Keep critical systems gated. Never give unsupervised access to email sending, financial systems, or anything with real-world consequences you can’t undo.
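The "start read-only, expand gradually" advice above can be sketched as a tool allowlist that you widen only after verifying behavior. This is a generic illustration, not OpenClaw's permission system; the class and tool names are assumptions:

```python
class ToolGate:
    """Allowlist wrapper: the agent can only call tools explicitly granted."""

    def __init__(self, allowed: set):
        self.allowed = set(allowed)

    def call(self, tool: str, fn, *args):
        if tool not in self.allowed:
            raise PermissionError(f"tool '{tool}' not in allowlist")
        return fn(*args)

    def grant(self, tool: str):
        # Expand autonomy incrementally, one verified tool at a time.
        self.allowed.add(tool)

# Day one: read-only tools only.
gate = ToolGate({"read_file", "summarize"})
gate.call("summarize", lambda text: text[:20], "a long report...")  # allowed
# gate.call("send_email", ...) raises PermissionError until you grant it.
```

Whatever permission mechanism your agent platform provides, the same principle applies: write access is a privilege the agent earns, not a default.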
The technology is genuinely impressive and improving fast. But the gap between “technically possible” and “reliably useful” is where most of the real work happens. Treat your agent like a capable but inexperienced new hire: verify its work, expand responsibilities gradually, and don’t hand it the keys to anything you can’t easily fix.
Sources: Fortune. Related: OpenClaw goes rogue: the Meta exec email incident, Mac Mini always-on setup.