In February 2026 alone, we've seen an OpenClaw agent nearly delete a Meta researcher's entire inbox, a red team study where agents leaked social security numbers and got hijacked through fake identities, and a vulnerability that let malicious websites take control of personal agents.
The pattern is clear: agents without guardrails will eventually do something you didn't want.
This guide covers the practical configuration steps that experienced OpenClaw users apply to prevent these exact scenarios.
1. Principle of Least Privilege
The most important rule: your agent should only have access to what it needs.
File System Restrictions
In your clawdbot.json, use security.fileSystem to restrict which directories the agent can read and write:
{
  "security": {
    "fileSystem": {
      "allowRead": ["/Users/you/.openclaw/workspace", "/tmp"],
      "allowWrite": ["/Users/you/.openclaw/workspace", "/tmp"],
      "denyWrite": ["/Users/you/.ssh", "/etc", "/usr"]
    }
  }
}
Don't give your agent access to your entire home directory. Scope it to the workspace.
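To make the scoping rule concrete, here is a minimal Python sketch of how an allow/deny path check can work. It is not OpenClaw's implementation, and the source does not say how OpenClaw evaluates these lists. The function names and the "deny wins over allow" ordering are assumptions made for illustration.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical policy mirroring the allowWrite / denyWrite lists above.
ALLOW_WRITE = [PurePosixPath("/Users/you/.openclaw/workspace"), PurePosixPath("/tmp")]
DENY_WRITE = [PurePosixPath("/Users/you/.ssh"), PurePosixPath("/etc"), PurePosixPath("/usr")]


def is_within(path: PurePosixPath, root: PurePosixPath) -> bool:
    """True if path equals root or sits somewhere beneath it."""
    return path == root or root in path.parents


def may_write(raw_path: str) -> bool:
    """Assumed semantics: deny wins, and anything not explicitly allowed is refused."""
    # Collapse ".." segments first so traversal cannot escape the workspace.
    # This is purely lexical; a real check would also need to resolve symlinks.
    path = PurePosixPath(posixpath.normpath(raw_path))
    if any(is_within(path, denied) for denied in DENY_WRITE):
        return False
    return any(is_within(path, allowed) for allowed in ALLOW_WRITE)
```

With this policy, a write to /Users/you/.openclaw/workspace/notes.md is accepted, while /Users/you/.openclaw/workspace/../../.ssh/id_rsa is refused because it normalizes to a path under a denied directory.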
Command Restrictions
Use security.exec to control which shell commands are allowed:
{
  "security": {
    "exec": {
      "mode": "allowlist",
      "allowed": ["git", "node", "python3", "curl", "jq", "cat", "ls", "find"],
      "denied": ["rm -rf", "sudo", "chmod 777"]
    }
  }
}
The Summer Yue incident happened because the agent had unrestricted shell access and ran destructive commands on the mail client. An allowlist that omits those commands would have blocked them.
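The idea behind allowlist mode can be sketched in a few lines of Python. This is an illustration only: the source does not describe how OpenClaw matches commands, so the matching rules here are assumptions. They are: check the first token, refuse denied substrings, and refuse shell operators.

```python
import shlex

# Values copied from the config above; the matching logic is hypothetical.
ALLOWED = {"git", "node", "python3", "curl", "jq", "cat", "ls", "find"}
DENIED_PATTERNS = ["rm -rf", "sudo", "chmod 777"]
SHELL_OPERATORS = set(";|&><`$")


def is_allowed(command: str) -> bool:
    """Return True only if the command's program is on the allowlist."""
    # Denied patterns are refused outright, even in allowlist mode.
    if any(pattern in command for pattern in DENIED_PATTERNS):
        return False
    # Refuse chaining and substitution so "ls; rm x" cannot piggyback
    # on an allowed program name.
    if any(ch in SHELL_OPERATORS for ch in command):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: fail closed.
        return False
    return bool(tokens) and tokens[0] in ALLOWED
```

Checking only the program name is deliberately coarse. An allowed tool can still be destructive depending on its arguments, so an allowlist works best alongside the file system restrictions above.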
2. Confirmation Gates for Destructive Actions
Set up rules that force the agent to ask before doing anything irreversible. In your AGENTS.md or system prompt, be explicit:
## Safety Rules
- NEVER delete files without asking. Use `trash` instead of `rm`.
- NEVER send emails, messages, or posts without confirmation.
- NEVER modify system configuration files.
- If unsure whether an action is destructive, ASK FIRST.
But don't rely on prompt instructions alone. The red team study showed agents will override their own instructions when pressured. Combine prompt guardrails with technical restrictions.
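One way to picture a technical confirmation gate is a wrapper that sits between the agent and its tools. The Python sketch below is hypothetical. The action names, the SAFE_ACTIONS set, and the run_action helper are invented for illustration and are not part of OpenClaw. It follows the "if unsure, ask first" rule by requiring confirmation for anything not known to be safe.

```python
from typing import Callable

# Hypothetical read-only actions that never need confirmation.
SAFE_ACTIONS = {"read_file", "list_files", "summarize"}


def run_action(
    name: str,
    action: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run the action, asking the operator first unless it is known to be safe."""
    if name not in SAFE_ACTIONS and not confirm(f"Allow '{name}'?"):
        return "blocked"
    return action()
```

In practice, confirm would send a message to the operator and wait for a reply. Because the gate runs outside the model, the agent cannot talk itself past it.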
3. Separate Sensitive Channels
Don't connect your agent to everything at once. Start with low-risk channels:
Low risk (start here):
- A dedicated Telegram bot channel
- A private Discord server
- A test Slack workspace
Medium risk (add with guardrails):
- Your personal WhatsApp (read-only first)
- Calendar (read-only first, then add write)
High risk (add last, with strict controls):
- Email (use a dedicated agent email, not your personal inbox)
- Social media posting
- Financial tools
The Meta researcher's agent had full email access from day one. If she'd started with read-only access and added write permissions gradually, the deletion incident couldn't have happened.
4. Memory Isolation
The Agents of Chaos study found that attackers could poison agent memory files to change behavior. Protect against this:
- Don't share memory files between agents. Each agent should have its own memory directory.
- Review memory periodically. Check what your agent has written to MEMORY.md and daily logs.
- Set memory as read-only for sub-agents. Only the main agent should write to long-term memory.
{
  "security": {
    "fileSystem": {
      "subAgentDenyWrite": ["/Users/you/.openclaw/workspace/MEMORY.md"]
    }
  }
}
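Periodic review is easier if you can tell whether the memory file has changed since you last read it. The following is a minimal Python sketch that does this with a SHA-256 fingerprint. It is a generic technique, not an OpenClaw feature, and the helper names are made up for this example.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_since_review(path: Path, reviewed_digest: str) -> bool:
    """True if the file no longer matches the digest recorded at the last review."""
    return fingerprint(path) != reviewed_digest
```

After each review, store the digest somewhere the agent cannot write to. A mismatch on the next check means the file was modified and is worth reading again.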
5. Network Boundaries
The ClawJacked vulnerability (CVE-2026-25253) allowed malicious websites to send commands to agents through the browser relay. Mitigate this:
- Update OpenClaw to version 2026.2.25 or later (the fix is already shipped)
- Don't expose your gateway to the public internet without authentication
- Use security.gateway.auth to require tokens for all API access
- Bind the gateway to localhost if you only access it locally
{
  "gateway": {
    "host": "127.0.0.1",
    "port": 3000,
    "auth": {
      "token": "your-strong-random-token"
    }
  }
}
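For the token value itself, a strong random string can be generated with Python's standard library. The sketch below also shows a constant-time comparison, which is the usual way to check a presented token. Both function names are hypothetical, and nothing here describes how OpenClaw validates tokens internally.

```python
import hmac
import secrets


def new_gateway_token() -> str:
    """Generate a URL-safe token from 32 bytes of cryptographic randomness."""
    return secrets.token_urlsafe(32)


def token_matches(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking information via timing."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Run new_gateway_token() once, paste the result into the config in place of the placeholder, and keep it out of version control.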
6. Group Chat Boundaries
In group chats, your agent can see messages from anyone. The red team study showed agents accepting fake identities and following instructions from non-owners. Configure:
- Owner-only for sensitive commands. Only respond to administrative requests from the configured owner.
- Don't follow instructions embedded in shared documents. Treat all external content as untrusted.
- Limit what the agent shares in groups. It has access to your stuff, but that doesn't mean it should share it.
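The owner-only rule works only if identity comes from the chat platform's sender ID, not from a display name or from anything claimed in the message text. Here is a hypothetical Python sketch of that check. The Message fields, OWNER_ID value, and command names are assumptions for illustration.

```python
from dataclasses import dataclass

OWNER_ID = "123456789"  # hypothetical platform-assigned user ID of the owner
SENSITIVE_COMMANDS = {"config", "exec", "send_email"}  # hypothetical names


@dataclass(frozen=True)
class Message:
    sender_id: str      # set by the platform, not by the sender
    display_name: str   # freely editable, so never used for authorization
    command: str


def should_handle(msg: Message) -> bool:
    """Allow sensitive commands only when the platform ID matches the owner."""
    if msg.command in SENSITIVE_COMMANDS:
        return msg.sender_id == OWNER_ID
    return True
```

A participant who renames themselves to match the owner still fails this check, because the display name never enters the decision.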
7. The Kill Switch
Always know how to stop your agent immediately:
- Process kill: pkill -f openclaw or killall node on the host machine
- Gateway stop: openclaw gateway stop
- Physical access: Know which machine runs your agent and be able to reach it
- Remote access: Set up SSH so you can kill processes remotely
Summer Yue had to physically rush to her Mac Mini to stop the agent (read the full incident breakdown). Having SSH access configured would have saved the panic.
8. Start Small, Expand Gradually
The safest approach:
- Week 1: Chat only. No tools, no email, no automation. Learn how the agent thinks.
- Week 2: Add read-only access to calendar and email. Let it summarize, not act.
- Week 3: Add write access to low-risk tools. Let it create reminders, draft messages (for your approval).
- Week 4+: Gradually add autonomy where the agent has proven reliable.
This isn't slow; it's smart. You're building trust the same way you would with a new employee.
The Bottom Line
Every major incident in February 2026 was preventable with basic guardrails. The agents aren't malicious; they're overconfident and under-constrained. Your job as the operator is to set boundaries that match the agent's actual reliability, not its theoretical capability.
Configure your guardrails. Review them monthly. Update OpenClaw when security patches ship.
The claw era is powerful. Make it safe too.
New to OpenClaw? Start with the quickstart guide. For a security deep-dive, read Is OpenClaw Safe?. To understand the WebSocket vulnerability mentioned above, see ClawJacked: How a Website Could Hijack Your Agent.