Running an AI agent 24/7 sounds expensive. And it can be — if you’re not smart about it. We’ve seen users burning $150+/month on OpenClaw API costs, and others running the same setup for under $15. The difference isn’t features. It’s configuration.
Here are the concrete strategies that can cut your OpenClaw costs by 80% or more, with real numbers to back them up.
Understanding Where Your Tokens Go
Before optimizing, you need to know where the money goes. A typical OpenClaw setup burns tokens on:
| Activity | Frequency | Token Usage | % of Total |
|---|---|---|---|
| Heartbeats | Every 15-30 min | ~2K tokens each | 40-60% |
| Channel messages | Varies | ~1-3K tokens each | 15-25% |
| Complex tasks | A few per day | ~5-20K tokens each | 15-30% |
| Cron jobs | Scheduled | ~1-3K tokens each | 5-10% |
The insight: heartbeats are your biggest expense, not the complex tasks you actually care about. Most of the time, your agent checks in, sees nothing to do, and replies HEARTBEAT_OK. You’re paying premium model prices for a glorified health check.
Strategy 1: Use Cheap Models for Routine Tasks
This is the single biggest cost saver. Use a cheap model for heartbeats and simple tasks, and reserve expensive models for when you actually need intelligence.
# Set a cheap model as your default (handles heartbeats, simple messages)
openclaw config set default_model deepseek/deepseek-chat
# Use a smart model for complex tasks via model override
openclaw config set task_model anthropic/claude-sonnet-4-20250514
Cost comparison for heartbeats (48/day at ~2K input tokens each, 30-day month, input tokens only):
| Model | Input Cost (per 1M tokens) | Daily Heartbeat Cost | Monthly Heartbeat Cost |
|---|---|---|---|
| Claude Opus | $15.00 | ~$1.44 | ~$43.20 |
| Claude Sonnet | $3.00 | ~$0.29 | ~$8.64 |
| GPT-4o | $2.50 | ~$0.24 | ~$7.20 |
| DeepSeek V3 | $0.27 | ~$0.03 | ~$0.78 |
| Gemini Flash | $0.075 | ~$0.007 | ~$0.22 |
Switching heartbeats from Claude Opus to DeepSeek V3 alone saves about $42/month ($43.20 vs $0.78). The heartbeat task (read a file, check if anything needs attention, reply OK) doesn’t need frontier intelligence.
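The monthly figures in the table can be reproduced with a few lines of shell. This is a minimal sketch: the function name `monthly_heartbeat_cost` is hypothetical, and it assumes a 30-day month, ~2K input tokens per heartbeat, and no output-token cost, which is how the table appears to be calculated.

```shell
# Monthly input-token cost for a fixed number of heartbeat calls per day.
# Arguments: calls per day, input tokens per call, price in USD per 1M input tokens.
# Assumes a 30-day month and ignores output tokens.
monthly_heartbeat_cost() {
  awk -v calls="$1" -v tokens="$2" -v price="$3" \
    'BEGIN { printf "%.2f\n", calls * tokens * 30 * price / 1000000 }'
}

monthly_heartbeat_cost 48 2000 15.00   # Claude Opus row
monthly_heartbeat_cost 48 2000 0.27    # DeepSeek V3 row
```

The two calls print 43.20 and 0.78. Plug in your own call count, context size, and provider price to see what a model switch would be worth in your setup.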
Strategy 2: Tune Your Heartbeat Interval
The default heartbeat interval might be more frequent than you need. Do you really need your agent checking in every 15 minutes at 3 AM?
# Increase heartbeat interval to 30 minutes (1800 seconds)
openclaw config set heartbeat.interval 1800
# Or even 60 minutes during off-hours
# Use cron for time-specific checks instead
Impact: Heartbeat cost scales with the number of calls, so doubling your interval from 15 to 30 minutes cuts heartbeat costs in half. Going from 15 to 60 minutes cuts them by 75%.
Consider scheduling different intervals for different times:
- Daytime (8AM-10PM): Every 30 minutes — stay responsive
- Nighttime (10PM-8AM): Every 60-90 minutes — nothing urgent happens at 3 AM
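To see what a split schedule does to the call count, here is a small sketch. The function name `heartbeats_per_day` is hypothetical, and it uses integer division, so an interval that doesn't divide evenly into the window is rounded down.

```shell
# Heartbeats per day for a two-part schedule.
# Arguments: daytime hours, daytime interval (minutes), nighttime interval (minutes).
# The nighttime window is whatever remains of the 24 hours.
heartbeats_per_day() {
  day_hours=$1
  night_hours=$(( 24 - day_hours ))
  echo $(( day_hours * 60 / $2 + night_hours * 60 / $3 ))
}

heartbeats_per_day 24 15 15   # flat 15-minute interval, all day
heartbeats_per_day 14 30 60   # 8AM-10PM every 30 min, 10PM-8AM every 60 min
```

This prints 96 and 38: the day/night split makes roughly 60% fewer calls than a flat 15-minute interval.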
Strategy 3: Prompt Caching
Many API providers now support prompt caching, where repeated prefixes in your prompts get cached and charged at a steep discount. OpenClaw’s system prompts (SOUL.md, AGENTS.md, context files) are largely static — perfect for caching.
Anthropic’s prompt caching:
- Cached input tokens: $0.30/1M (vs $3.00 for Sonnet regular input)
- That’s a 90% discount on cached portions
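Since OpenClaw sends the same system context every call, a significant chunk of your input tokens can qualify for caching. It is not always automatic, though. Anthropic’s API only caches prefixes that the request marks with cache_control, and cached entries expire after about 5 minutes without a hit by default, so heartbeats spaced 30 minutes apart may miss the cache. Make sure your provider supports caching and that it’s actually enabled:
# Anthropic caching requires cache_control markers in the request
# Ensure your provider or proxy passes them through, and check the usage data for cache hits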
Tip: Keep your system files (SOUL.md, AGENTS.md, TOOLS.md) stable. Every edit invalidates the cache. Make changes in batches rather than constant small tweaks.
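As a rough sketch of what caching does to your blended input price, the snippet below weights the cached and regular rates by the share of each prompt that hits the cache. The function name `effective_input_price` is hypothetical, and the 80% cached share is an illustrative assumption, not a measured OpenClaw figure. It also ignores any cache-write surcharge and cache expiry, which vary by provider.

```shell
# Effective input price per 1M tokens when part of each prompt is read from cache.
# Arguments: cached fraction (0 to 1), regular price, cached price (USD per 1M tokens).
effective_input_price() {
  awk -v frac="$1" -v regular="$2" -v cached="$3" \
    'BEGIN { printf "%.2f\n", frac * cached + (1 - frac) * regular }'
}

effective_input_price 0.8 3.00 0.30   # 80% of the prompt cached, Sonnet prices
```

With these inputs the blended rate comes out to 0.84 per 1M tokens instead of 3.00. The larger the static share of your prompt, the closer you get to the full 90% discount.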
Strategy 4: Run Local Models with Ollama
For the ultimate cost optimization: run some tasks locally for free.
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a capable model
ollama pull llama3.3:70b
# Configure OpenClaw to use it for heartbeats
openclaw config set heartbeat.model ollama/llama3.3
API cost: $0. You’re using your own hardware, so you pay only for electricity.
Local models work great for:
- Heartbeats and simple checks
- Message triage (is this urgent?)
- Simple summarization
- Routine automations
They’re less ideal for:
- Complex reasoning and multi-step planning
- Code generation
- Tasks requiring large context windows
Hardware requirements: A 70B model needs ~40GB of RAM even when quantized, so the llama3.3:70b pull above only makes sense on a machine with more memory than that. If you’re running OpenClaw on a Mac with 32GB, you’re in good shape with smaller models. A 7-8B model runs on basically anything.
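A common back-of-the-envelope estimate for model memory is parameters times bits per weight, divided by 8. The sketch below applies it, assuming 4-bit quantization (the text above only says "quantized", so the bit width is an assumption) and a hypothetical function name `model_weight_gb`. It covers the weights only. Context cache and runtime overhead come on top.

```shell
# Rough memory needed for model weights alone: parameters x bits per weight / 8.
# Arguments: parameters in billions, bits per weight. Prints gigabytes.
model_weight_gb() {
  awk -v params="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", params * bits / 8 }'
}

model_weight_gb 70 4   # 70B model at 4-bit quantization
model_weight_gb 8 4    # 8B model at 4-bit quantization
```

This prints 35.0 and 4.0. The 70B result lands close to the ~40GB figure once overhead is added, and the 8B result shows why small models fit on almost any recent machine.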
Strategy 5: Manage Your Context Window
Every token of context you send costs money. OpenClaw loads several files each session — and bloated context files silently inflate your costs.
Audit your context size:
# Check how big your context files are
wc -c SOUL.md AGENTS.md TOOLS.md MEMORY.md USER.md
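Note that wc -c reports bytes, not tokens. A rough conversion, assuming about 4 bytes per token (a common rule of thumb for English text; real tokenizers vary by model), might look like this. The function name `estimate_tokens` is hypothetical.

```shell
# Rough token estimate from a byte count, assuming ~4 bytes per token.
estimate_tokens() {
  echo $(( $1 / 4 ))
}

# Print an estimate for each context file that exists in the current directory.
for f in SOUL.md AGENTS.md TOOLS.md MEMORY.md USER.md; do
  [ -f "$f" ] || continue              # skip files that are not present
  bytes=$(wc -c < "$f")
  echo "$f: ~$(estimate_tokens "$bytes") tokens"
done
```

By this estimate, 32,000 bytes of context files works out to roughly 8,000 tokens per call.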
Optimization tips:
- Keep SOUL.md concise. Personality doesn’t need 5,000 words. Aim for under 1,000.
- Trim MEMORY.md regularly. Old memories from months ago? Archive or delete them.
- Use HEARTBEAT.md wisely. A short checklist, not a novel.
- Limit daily memory files. Only load today and yesterday, not the whole week.
Example savings: Reducing context from 8,000 tokens to 3,000 tokens saves 5,000 input tokens per call. At 48 heartbeats/day with Sonnet pricing:
5,000 tokens × 48 calls × 30 days = 7.2M tokens/month
7.2M × $3.00/1M = $21.60/month saved
That’s real money, just from trimming files.
Strategy 6: Use Cron Jobs Instead of Heartbeats for Specific Tasks
Instead of having your heartbeat check everything, offload specific scheduled tasks to cron jobs with cheap models:
# Email check every hour with a cheap model
openclaw cron add --schedule "0 * * * *" \
--model deepseek/deepseek-chat \
--task "Check email via himalaya, summarize any urgent unread"
# Calendar check twice daily
openclaw cron add --schedule "0 9,17 * * *" \
--model deepseek/deepseek-chat \
--task "Check calendar for next 24h, alert on conflicts"
This lets your heartbeat stay lightweight (just a quick HEARTBEAT_OK most of the time) while specific tasks run on their own schedule with appropriate models.
Putting It All Together: A Real Cost Breakdown
Before optimization (common beginner setup):
| Item | Model | Monthly Cost |
|---|---|---|
| Heartbeats (every 15 min) | Claude Sonnet | $17.28 |
| Messages (~30/day) | Claude Sonnet | $8.10 |
| Complex tasks (~5/day) | Claude Sonnet | $13.50 |
| Cron jobs (~10/day) | Claude Sonnet | $5.40 |
| Total | | $44.28 |
After optimization:
| Item | Model | Monthly Cost |
|---|---|---|
| Heartbeats (every 30 min, cheap model) | DeepSeek V3 | $0.78 |
| Messages (~30/day) | DeepSeek V3 | $0.81 |
| Complex tasks (~5/day) | Claude Sonnet (cached) | $4.05 |
| Cron jobs (~10/day) | DeepSeek V3 | $0.54 |
| Total | | $6.18 |
That’s an 86% reduction: from about $44 to about $6/month for a fully functional 24/7 AI agent.
The Golden Rules
- Cheap model for routine, smart model for complexity. This one rule saves more than everything else combined.
- Heartbeats are your biggest expense. Optimize frequency and model choice there first.
- Cache-friendly context. Keep system files stable and concise.
- Monitor your spending. Check your API dashboard weekly. Costs creep up when you add new skills and channels.
- Local models are free. If you have the hardware, use Ollama for simple tasks.
Quick-Start Checklist
- Switch heartbeat model to DeepSeek V3 or Gemini Flash
- Increase heartbeat interval to 30+ minutes
- Audit and trim context files (SOUL.md, MEMORY.md)
- Move scheduled tasks from heartbeat to cron jobs
- Enable prompt caching if your provider supports it
- Consider Ollama for heartbeats if you have the hardware
- Check your API dashboard and set spending alerts
Want to find the best cheap model for your specific use case? Check our model comparison guide for detailed benchmarks. Have a cost-saving tip we missed? Share it in the community Discord.