Headroom — Context Optimization for LLM Agents
Compress everything your AI agent reads. Most of what an agent ingests — tool calls, DB queries, file reads, RAG retrievals — is 70-95% boilerplate. Headroom compresses it away before it reaches the model. Same answers, a fraction of the tokens.
Install
pip install "headroom-ai[all]"
How it works
- CacheAligner — Keeps message prefixes byte-stable so provider-side KV caches can hit.
- ContentRouter — Auto-detects content type and routes to the optimal compressor. AST-aware for 6 languages.
- IntelligentContext — Score-based token fitting with learned importance. Originals stored for on-demand retrieval.
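To make the ContentRouter idea concrete, here is a minimal sketch of type-detection-then-compression in plain Python. The function names (`route`, `compress_json_array`) and the heuristics are illustrative assumptions, not Headroom's actual API or detection logic:

```python
import json

def route(text: str) -> str:
    """Sniff the content type to pick a compressor.
    Illustrative heuristic only -- not Headroom's detector."""
    stripped = text.strip()
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass
    return "text"

def compress_json_array(text: str, keep: int = 3) -> str:
    """Truncate a long JSON array to a sample plus a marker
    noting how many items were omitted."""
    data = json.loads(text)
    if isinstance(data, list) and len(data) > keep:
        omitted = len(data) - keep
        # Drop the closing bracket, append an omission marker.
        return json.dumps(data[:keep])[:-1] + f', "...{omitted} more items"]'
    return text

# A 100-row tool result shrinks to a 3-row sample plus a count.
payload = json.dumps([{"id": i} for i in range(100)])
kind = route(payload)
small = compress_json_array(payload)
```

The real library stores the original alongside the compressed form (per IntelligentContext's on-demand retrieval), so nothing is lost if the model later needs the full payload.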
Works with Claude Code, Codex, Cursor, Aider, LangChain, CrewAI, Agno, LiteLLM, MCP, and AWS Strands.
Open source under Apache 2.0.