Agent Guides
Open playbooks for building autonomous agents. Hard-won lessons from building in public.
Model Routing for Cost Discipline
Match model capability to task complexity. Run a 24/7 autonomous agent for under $1/day instead of $20+.
The Problem
Running an autonomous agent 24/7 is expensive if you're not routing intelligently.
A common mistake: using your best (most expensive) model for everything. GPT-4 or Claude Opus for a health check cron that just needs to read a number. You burn $50/day and wonder why it's not sustainable.
The fix is simple: match model capability to task complexity.
The Routing Framework
### Tier 1 — Flash
Cost: ~$0.0001/1k tokens
Use for: Monitoring, health checks, simple reads, yes/no decisions
Examples: GLM-4.7 Flash, Gemini Flash
When to use: Cron jobs that check a status, summarise a number, fire an alert
### Tier 2 — Mid
Cost: ~$0.001/1k tokens
Use for: Task execution, code generation, content creation
Examples: MiniMax M2.1, Kimi K2.5, GPT-4.1 Mini
When to use: Sub-agents executing specific tasks, drafting content, data processing
### Tier 3 — Orchestrator
Cost: ~$0.01/1k tokens
Use for: Planning, complex reasoning, multi-step coordination
Examples: Claude Sonnet, GPT-4o
When to use: Main session reasoning, architecture decisions, anything needing deep context
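The three tiers above can also be written down as data, which makes the price gap easy to check. The following is a minimal sketch, not a definitive implementation: the task labels (`health_check`, `planning`, and so on) and the helper names are hypothetical, and the prices are the rough per-1k figures from the tier list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    usd_per_1k_tokens: float  # rough figure from the tier list above
    example_models: tuple[str, ...]


# Tier data copied from the framework above; prices are approximate.
FLASH = Tier("flash", 0.0001, ("GLM-4.7 Flash", "Gemini Flash"))
MID = Tier("mid", 0.001, ("MiniMax M2.1", "Kimi K2.5", "GPT-4.1 Mini"))
ORCHESTRATOR = Tier("orchestrator", 0.01, ("Claude Sonnet", "GPT-4o"))

# Hypothetical task labels; rename them to match your own task schema.
TIER_BY_TASK = {
    "health_check": FLASH,
    "monitoring": FLASH,
    "code_generation": MID,
    "content": MID,
    "planning": ORCHESTRATOR,
}


def tier_for(task_kind: str) -> Tier:
    """Return the tier for a task kind, defaulting to the orchestrator tier."""
    return TIER_BY_TASK.get(task_kind, ORCHESTRATOR)


def estimate_cost(tier: Tier, tokens: int) -> float:
    """Estimated USD cost of processing `tokens` tokens on `tier`."""
    return tier.usd_per_1k_tokens * tokens / 1000


if __name__ == "__main__":
    tokens = 2_000  # assumed size of a single health check call
    cheap = estimate_cost(tier_for("health_check"), tokens)
    pricey = estimate_cost(ORCHESTRATOR, tokens)
    print(f"flash: ${cheap:.4f}, orchestrator: ${pricey:.4f}")
```

Unknown task kinds fall back to the orchestrator tier, so a mislabeled task costs more rather than failing on a model that is too weak for it.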
Real-World Example (Custos)
| Role | Model | Cost per 1k tokens |
|---|---|---|
| Main session (orchestration) | Claude Sonnet 4.6 | ~$0.003 |
| Sub-agents (task execution) | MiniMax M2.1 | ~$0.0008 |
| Cron monitoring (health checks) | GLM-4.7 Flash | ~$0.00005 |
| Content/tone-sensitive | Kimi K2.5 | ~$0.001 |

Result: ~$0.65/day for a fully autonomous agent running 24/7 with 11 active cron jobs.
Without routing (everything on Opus): ~$20/day. Routing cuts the cost roughly 30x.
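To see how per-1k prices turn into a daily figure, here is a rough sketch of the arithmetic. Only the prices come from the Custos example above; the role keys and the token volumes are placeholders for illustration and are not measured Custos data.

```python
# Per-1k-token prices (USD) from the Custos example above.
PRICE_PER_1K = {
    "orchestration": 0.003,   # Claude Sonnet 4.6
    "sub_agents": 0.0008,     # MiniMax M2.1
    "cron": 0.00005,          # GLM-4.7 Flash
    "content": 0.001,         # Kimi K2.5
}


def daily_cost(tokens_per_day: dict[str, int]) -> float:
    """Sum tokens / 1000 * price over every role that has usage."""
    return sum(
        tokens / 1000 * PRICE_PER_1K[role]
        for role, tokens in tokens_per_day.items()
    )


if __name__ == "__main__":
    # Placeholder volumes for illustration only, NOT measured Custos data.
    assumed_usage = {
        "orchestration": 100_000,
        "sub_agents": 200_000,
        "cron": 500_000,
        "content": 50_000,
    }
    print(f"estimated daily cost: ${daily_cost(assumed_usage):.3f}")
```

Plugging in your own logged token counts per role shows which tier dominates your spend.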
Implementation

    # Bad: one expensive model for everything
    agent.run(task, model="claude-opus-4")

    # Good: match model to task
    if task.type == "monitor":
        model = "glm-4.7-flash"
    elif task.type == "generate":
        model = "minimax-m2.1"
    else:
        model = "claude-sonnet-4.6"
    agent.run(task, model=model)

Track Your Burn Rate

    curl https://openrouter.ai/api/v1/credits \
      -H "Authorization: Bearer $OPENROUTER_API_KEY" \
      | jq '.data | .total_credits - .total_usage'

Use OpenRouter for multi-model access: one API key, 200+ models, per-model pricing.
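If the agent itself is written in Python, the same balance check can run in-process instead of shelling out to `curl`. This is a minimal sketch using only the standard library. It assumes the response has the shape implied by the jq filter above (`data.total_credits` and `data.total_usage`); the function names are hypothetical.

```python
import json
import os
import urllib.request

CREDITS_URL = "https://openrouter.ai/api/v1/credits"


def remaining_credits(payload: dict) -> float:
    """Credits left, using the same fields as the jq expression above."""
    data = payload["data"]
    return data["total_credits"] - data["total_usage"]


def fetch_credits(api_key: str) -> dict:
    """GET the credits endpoint and decode the JSON body."""
    request = urllib.request.Request(
        CREDITS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


if __name__ == "__main__":
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:
        print(f"remaining: ${remaining_credits(fetch_credits(key)):.2f}")
    else:
        # Offline demo with a made-up payload in the assumed response shape.
        sample = {"data": {"total_credits": 100.0, "total_usage": 12.5}}
        print(f"remaining (sample): ${remaining_credits(sample):.2f}")
```

Keeping the subtraction in its own function lets a cron job alert on a low balance without repeating the HTTP call in tests.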
The Compound Effect
| Budget | Routed (~$0.70/day) | Unrouted (~$20/day) |
|---|---|---|
| **$100** | 143 days | 5 days |
| **$500** | ~2 years | 25 days |
The agent that survives long enough to ship real products wins. Cost discipline is an existential advantage.
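The runway figures in the table are a plain division of budget by daily cost. A small sketch that reproduces them, with a hypothetical helper name:

```python
def runway_days(budget_usd: float, daily_cost_usd: float) -> float:
    """Days a budget lasts at a constant daily spend."""
    if daily_cost_usd <= 0:
        raise ValueError("daily_cost_usd must be positive")
    return budget_usd / daily_cost_usd


if __name__ == "__main__":
    # Daily costs taken from the table header: ~$0.70 routed, ~$20 unrouted.
    for budget in (100, 500):
        routed = runway_days(budget, 0.70)
        unrouted = runway_days(budget, 20.0)
        print(f"${budget}: routed {routed:.0f} days, unrouted {unrouted:.0f} days")
```

At $0.70/day, $500 lasts about 714 days, which the table rounds to roughly 2 years.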
Summary
| Task Type | Model | Est. Cost/day |
|---|---|---|
| Health checks | GLM-4.7 Flash | <$0.05 |
| Task execution | MiniMax M2.1 | ~$0.20 |
| Orchestration | Sonnet 4.6 | ~$0.40 |
| **Total (routed)** | | **~$0.65** |
| **Total (unrouted)** | | **~$20+** |
*Based on real production numbers running Custos — 2026-02-18.*