OpenClaw’s memory system is one of its most compelling features — and one of its most misunderstood. A recent code audit by an independent developer dug into the source code to understand exactly how your agent remembers across sessions. The findings reveal a system that’s elegantly simple by design — but has real gaps that will matter as agents accumulate months of context.
Markdown Files All the Way Down
OpenClaw’s philosophy is refreshingly transparent: files are the source of truth. Everything your agent remembers lives as plain Markdown files on disk in ~/.openclaw/workspace/:
- MEMORY.md — Long-term storage for durable facts, preferences, and decisions
- memory/YYYY-MM-DD.md — Daily logs with running context from each session
No proprietary database. No hidden state. You can open these files in any editor, version them with Git, and see exactly what your agent “knows.” This is a deliberate design choice that prioritizes transparency and user control over sophistication.
The search system behind it is genuinely well-engineered. When your agent needs to recall something, it uses hybrid retrieval: BM25 keyword matching (30% weight) combined with vector embeddings (70% weight) via SQLite. Files get chunked into ~400-token segments with overlap, embedded, and indexed. Both search methods run in parallel with merged scoring.
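To make the merged scoring concrete, here is a minimal sketch of that 30/70 blend. The weights and the BM25-plus-embeddings split come from the article; everything else (function names, the normalization step, the toy cosine similarity in place of a real embedding model) is an illustrative assumption, not OpenClaw's actual implementation.

```python
import math
import re

# Assumed weights, as described in the article: 30% keyword, 70% vector.
BM25_WEIGHT, VECTOR_WEIGHT = 0.3, 0.7
K1, B = 1.5, 0.75  # standard BM25 tuning parameters

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def bm25_scores(query, docs):
    """Plain BM25 over a list of chunk strings."""
    tokenized = [tokenize(d) for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = [0.0] * n
    for term in tokenize(query):
        df = sum(1 for toks in tokenized if term in toks)
        if df == 0:
            continue
        idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
        for i, toks in enumerate(tokenized):
            tf = toks.count(term)
            denom = tf + K1 * (1 - B + B * len(toks) / avgdl)
            scores[i] += idf * tf * (K1 + 1) / denom
    return scores

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query, docs, query_vec, doc_vecs):
    """Merge normalized BM25 scores with cosine similarity at 30/70."""
    bm25 = bm25_scores(query, docs)
    top = max(bm25) or 1.0  # avoid dividing by zero when nothing matches
    merged = [
        BM25_WEIGHT * (s / top) + VECTOR_WEIGHT * cosine(query_vec, v)
        for s, v in zip(bm25, doc_vecs)
    ]
    return sorted(range(len(docs)), key=lambda i: merged[i], reverse=True)
```

The interesting property of this blend is that a chunk can win on either signal: an exact keyword hit survives even when its embedding is mediocre, and a semantically close chunk survives even when it shares no tokens with the query.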
There’s also QMD, an experimental backend contributed by Tobi Lütke (the Shopify founder) that adds local LLM re-ranking on top — BM25 + vectors + re-ranking, all running locally with no API calls.
Four Limitations That Will Bite You
The audit identified four structural limitations that become more significant the longer you use OpenClaw:
1. No Temporal Awareness
OpenClaw’s memory doesn’t understand time. If you told your agent in January “I use MySQL” and in March “I switched to PostgreSQL,” both facts live in Markdown files with equal weight. There’s no concept of “this fact replaced that fact” or “this information is outdated.”
For a personal assistant accumulating months of context, this means your agent can give you wrong answers as contradictory facts pile up. The only mitigation today is manually curating MEMORY.md — which works, but doesn’t scale.
2. No Entity Relationships
Memory is stored as flat text chunks. There’s no structured understanding that “ProjectX is a project,” “ProjectX uses PostgreSQL,” and “PostgreSQL replaced MySQL in August.” The search system finds these by keyword and semantic similarity, but it doesn’t understand the graph of relationships between entities.
Ask “what database does my trading project use?” and you’re relying on the embedding model to connect the dots. Sometimes it works. Sometimes it retrieves the wrong fact from three months ago.
3. The Agent Has to Remember to Remember
OpenClaw doesn’t automatically persist everything to memory. Under cognitive load — juggling tool calls, reasoning through complex problems — it can forget to save important context. You often need to explicitly say “remember this” or hope the auto-flush before context compaction catches it.
And compaction itself is lossy. When conversation history gets too long, OpenClaw summarizes it. That specific API endpoint you debugged together? Compressed into “resolved a bug.” The exact configuration values? Gone.
4. No Custom Ontology
You can’t define what kinds of things the agent should track. There’s no structured schema for “Person,” “Project,” “Preference,” or “Decision.” Everything is free-form Markdown, which means memory quality depends on how well the LLM decides to format its notes — and that varies between models and sessions.
The Knowledge Graph Fix
The auditor proposed integrating Graphiti, a temporal knowledge graph by Zep, which addresses each limitation:
- Temporal awareness — When you say “I switched to PostgreSQL,” it marks the old relationship as ended and creates a new one with a timestamp. Ask the same question months later and you get the current answer.
- Entity relationships — Memory becomes a graph with typed connections: Person → builds → Project → uses → Database. Queries traverse relationships instead of hoping embeddings connect the dots.
- Automatic extraction — Entities and relationships get extracted from conversations automatically, reducing the “forgot to save” problem.
- Custom ontology — Define schemas for what matters (people, projects, preferences) so memory follows a consistent structure.
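The temporal-awareness item deserves a concrete illustration. The sketch below shows the core idea of validity intervals on graph edges: asserting a new fact closes the old edge instead of letting both coexist. This is a toy model of the concept, assuming nothing about Graphiti's real API; all class and method names here are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still current"

class TemporalGraph:
    """Toy temporal graph: new facts supersede old ones, nothing is deleted."""

    def __init__(self):
        self.edges: list[Edge] = []

    def assert_fact(self, subject, relation, obj, at):
        # Close any currently-valid edge for the same subject/relation
        # before recording the new one.
        for e in self.edges:
            if e.subject == subject and e.relation == relation and e.valid_to is None:
                e.valid_to = at
        self.edges.append(Edge(subject, relation, obj, at))

    def current(self, subject, relation):
        """Return the object of the one still-open edge, if any."""
        for e in self.edges:
            if e.subject == subject and e.relation == relation and e.valid_to is None:
                return e.obj
        return None

g = TemporalGraph()
g.assert_fact("ProjectX", "uses", "MySQL", datetime(2025, 1, 10))
g.assert_fact("ProjectX", "uses", "PostgreSQL", datetime(2025, 8, 2))
g.current("ProjectX", "uses")  # "PostgreSQL"; the MySQL edge now ends 2025-08-02
```

The key contrast with flat Markdown chunks: the January fact is never wrong in the graph, it is simply bounded in time, so "what does ProjectX use?" and "what did ProjectX use in February?" both have well-defined answers.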
Why the Current System Still Works
Despite these limitations, OpenClaw’s memory system is remarkably effective in practice for most users. Here’s why:
Transparency wins. Being able to open MEMORY.md in a text editor and see exactly what your agent knows — and edit it — is a superpower. Try doing that with a knowledge graph database.
Git versioning for free. Your agent’s memory gets the same version control as your code. Roll back a bad memory update with git checkout.
Human-readable curation. The daily notes + long-term memory pattern mirrors how humans actually organize knowledge (journals → distilled notes). It works because it maps to a familiar mental model.
Low complexity. No database to run, no graph engine to maintain, no schema migrations. Just Markdown files. For a system that runs on everything from Raspberry Pis to Mac Minis, this simplicity is a feature.
What You Can Do Today
While waiting for knowledge graph integrations, there are practical steps to mitigate the limitations:
- Curate MEMORY.md regularly — Remove outdated facts and update current state. Think of it as pruning.
- Use explicit “remember this” prompts for important decisions or changes.
- Structure your daily notes with consistent headers so search retrieval works better.
- Version control your workspace — git init in ~/.openclaw/workspace/ if you haven’t already.
- Review compaction behavior — If you’re losing important context, consider adjusting your context window settings or using models with larger windows.
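The curation advice above can be partially automated. Here is a small helper that lists daily log files older than a cutoff, so you can review and distill them into MEMORY.md before they go stale. The ~/.openclaw/workspace/memory/YYYY-MM-DD.md layout follows the article; the 30-day cutoff and the function name are arbitrary choices for this sketch.

```python
from datetime import date, timedelta
from pathlib import Path

def stale_daily_logs(workspace: Path, max_age_days: int = 30) -> list[Path]:
    """Return daily log files older than the cutoff, oldest first."""
    cutoff = date.today() - timedelta(days=max_age_days)
    stale = []
    for f in sorted((workspace / "memory").glob("*.md")):
        try:
            noted = date.fromisoformat(f.stem)  # filenames are YYYY-MM-DD.md
        except ValueError:
            continue  # skip files that don't follow the date pattern
        if noted < cutoff:
            stale.append(f)
    return stale

if __name__ == "__main__":
    for path in stale_daily_logs(Path.home() / ".openclaw" / "workspace"):
        print(path)
```

This only surfaces candidates; the actual pruning decision stays with you, which is exactly the transparency the flat-file design is meant to preserve.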
The Bigger Picture
OpenClaw’s memory system is a perfect example of the “worse is better” philosophy in software. A simpler system that’s transparent and user-controllable beats a sophisticated one that’s opaque — especially at this stage of AI agent development, where we’re still figuring out what “agent memory” should even look like.
The knowledge graph future is compelling, and integrations like Graphiti or the existing ontology skill hint at where things are headed. But for now, Markdown files with hybrid search get you surprisingly far.
The real insight from this audit isn’t about what’s missing. It’s that you can see what’s there — and that’s rarer than it should be in AI systems.
For a practical guide to configuring OpenClaw’s memory, see our Memory & Context Configuration Guide. For securing your agent’s data, check our Guardrails Setup Guide. Context window compaction can also lead to agents going rogue — another reason to understand these internals.