Headroom — Context Optimization for LLM Agents
Compress everything your AI agent reads. Most of what an agent ingests — tool calls, DB queries, file reads, RAG retrievals — is 70-95% boilerplate. Headroom compresses it away before it reaches the model. Same answers, a fraction of the tokens.
Install
pip install "headroom-ai[all]"
How it works
- CacheAligner — Keeps message prefixes byte-stable so provider-side KV caches can hit.
- ContentRouter — Auto-detects content type and routes to the optimal compressor. AST-aware for 6 languages.
- IntelligentContext — Score-based token fitting with learned importance. Originals stored for on-demand retrieval.
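To make the ContentRouter idea concrete, here is a minimal sketch of type-detection-then-compression in plain Python. The function names (`route`, `compress_json_array`) and the heuristics are illustrative assumptions, not Headroom's actual API or detection logic:

```python
import json

def route(text: str) -> str:
    """Sniff the content type to pick a compressor.
    Illustrative heuristic only -- not Headroom's detector."""
    stripped = text.strip()
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass
    return "text"

def compress_json_array(text: str, keep: int = 3) -> str:
    """Truncate a long JSON array to a sample plus a marker
    noting how many items were omitted."""
    data = json.loads(text)
    if isinstance(data, list) and len(data) > keep:
        omitted = len(data) - keep
        # Drop the closing bracket, append an omission marker.
        return json.dumps(data[:keep])[:-1] + f', "...{omitted} more items"]'
    return text

# A 100-row tool result shrinks to a 3-row sample plus a count.
payload = json.dumps([{"id": i} for i in range(100)])
kind = route(payload)
small = compress_json_array(payload)
```

The real library stores the original alongside the compressed form (per IntelligentContext's on-demand retrieval), so nothing is lost if the model later needs the full payload.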
Works with Claude Code, Codex, Cursor, Aider, LangChain, CrewAI, Agno, LiteLLM, MCP, and AWS Strands.
Open source under Apache 2.0.