MCP Server — Context Engineering Toolkit¶
Headroom's MCP server exposes compression, retrieval, and observability as tools that any MCP-compatible AI coding tool can use — Claude Code, Cursor, Codex, and more.
Quick Start¶
# Install (MCP is included with proxy, or standalone)
pip install "headroom-ai[proxy]" # Proxy + MCP tools
pip install "headroom-ai[mcp]" # MCP tools only (lightweight)
# Register with Claude Code (one-time)
headroom mcp install
# Start Claude Code — it now has headroom tools!
claude
That's it. Claude Code can now compress content on demand, retrieve originals, and check session stats — no proxy required.
For automatic compression of ALL traffic, also run the proxy:
Tools¶
The MCP server provides three tools:
headroom_compress¶
Compress content on demand. The LLM calls this when it wants to shrink large content before reasoning over it.
Tool: headroom_compress
Parameters:
- content (required): Text to compress (files, JSON, logs, search results, etc.)
Returns:
- compressed: Compressed text
- hash: Key for retrieving the original later
- original_tokens / compressed_tokens / savings_percent
- transforms: Which compression algorithms were applied
Example — Claude reads a large file, then compresses it:
Claude: Let me compress this large output to save context space.
→ headroom_compress(content="[5000 lines of grep results...]")
← {
"compressed": "[key matches with context...]",
"hash": "a1b2c3d4e5f6...",
"original_tokens": 12000,
"compressed_tokens": 3200,
"savings_percent": 73.3,
"transforms": ["router:search:0.27"]
}
The original is stored locally for the session (1-hour TTL). If Claude needs the full content later, it calls headroom_retrieve.
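The savings_percent figure in the example follows directly from the two token counts. A minimal sketch of that arithmetic, assuming the value is the share of tokens removed, rounded to one decimal place (the function name is illustrative and not part of Headroom's API):

```python
def savings_percent(original_tokens: int, compressed_tokens: int) -> float:
    """Share of tokens removed by compression, as a percentage (one decimal)."""
    if original_tokens <= 0:
        # Nothing to compress, so nothing was saved.
        return 0.0
    saved = original_tokens - compressed_tokens
    return round(100.0 * saved / original_tokens, 1)


# Token counts taken from the example response above.
print(savings_percent(12000, 3200))  # 73.3
```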
headroom_retrieve¶
Retrieve original uncompressed content by hash.
Tool: headroom_retrieve
Parameters:
- hash (required): Hash key from compression
- query (optional): Search within the original to return only matching items
Returns:
- original_content (full retrieval) or results (search)
- source: "local" or "proxy"
Retrieval checks the local store first (content compressed via headroom_compress), then falls back to the proxy's store (content compressed automatically by the proxy). Hashes from either source work transparently.
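The lookup order described above can be sketched as a two-tier fallback. This is an illustration only: both stores are plain dictionaries, the line-based query filter is a hypothetical stand-in for whatever search Headroom performs, and the error shape is an assumption (the message text is borrowed from the Troubleshooting section).

```python
from typing import Optional


def retrieve(hash_key: str, local_store: dict, proxy_store: dict,
             query: Optional[str] = None) -> dict:
    """Look up an original by hash: local store first, then the proxy store."""
    for source, store in (("local", local_store), ("proxy", proxy_store)):
        if hash_key not in store:
            continue
        original = store[hash_key]
        if query is None:
            # Full retrieval: return the whole original.
            return {"original_content": original, "source": source}
        # Hypothetical search: keep only lines that mention the query.
        matches = [line for line in original.splitlines()
                   if query.lower() in line.lower()]
        return {"results": matches, "source": source}
    # Assumed error shape; not confirmed by the source.
    return {"error": "Entry not found or expired"}


local = {"a1": "error: disk full\ninfo: ok"}
proxy = {"b2": "proxy-compressed original"}

print(retrieve("a1", local, proxy, query="error"))
print(retrieve("b2", local, proxy)["source"])
```

Because the caller only passes a hash, it never needs to know which store holds the original.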
headroom_stats¶
Session compression statistics — including sub-agent stats and proxy cache info.
Tool: headroom_stats
Returns:
- compressions, retrievals, tokens_saved, savings_percent
- estimated_cost_saved_usd
- recent_events (last 10 compression/retrieval events)
- sub_agents (stats from sub-agent MCP instances, if any)
- combined (main + sub-agent totals)
- proxy (request count, cache hits, cost saved — if proxy is running)
Sub-agent stats are aggregated via a shared stats file (~/.headroom/session_stats.jsonl). Each MCP server instance (main session and sub-agents) writes events there, and headroom_stats reads across all of them.
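A shared JSONL file works well for this kind of aggregation because each instance can append one line per event without coordinating with the others. The sketch below shows the idea against a temporary file; the event fields (instance, tokens_saved) are assumptions for illustration and not the actual schema of session_stats.jsonl.

```python
import json
import os
import tempfile
from collections import defaultdict


def record_event(path: str, instance_id: str, tokens_saved: int) -> None:
    """Append one event as a single JSON line (hypothetical schema)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"instance": instance_id,
                            "tokens_saved": tokens_saved}) + "\n")


def aggregate(path: str) -> dict:
    """Sum tokens saved per instance and across all instances."""
    totals = defaultdict(int)
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            event = json.loads(line)
            totals[event["instance"]] += event["tokens_saved"]
    per_instance = dict(totals)
    return {"per_instance": per_instance,
            "combined": sum(per_instance.values())}


# Demo against a temporary file rather than the real ~/.headroom path.
with tempfile.TemporaryDirectory() as tmp:
    stats_path = os.path.join(tmp, "session_stats.jsonl")
    record_event(stats_path, "main", 8800)
    record_event(stats_path, "sub-agent-1", 1200)
    record_event(stats_path, "main", 500)
    summary = aggregate(stats_path)

print(summary)
```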
Architecture¶
MCP Only (no proxy)¶
┌─────────────────────────────────────────────┐
│ Claude Code / Cursor / Codex │
│ │
│ LLM calls headroom_compress on demand │
│ ↓ │
│ Compression happens locally in MCP process │
│ Original stored in local CompressionStore │
│ ↓ │
│ LLM calls headroom_retrieve when needed │
└─────────────────────────────────────────────┘
MCP + Proxy (full setup)¶
┌─────────────────────────────────────────────┐
│ Claude Code │
│ │
│ 1. Sends request ──→ Proxy (auto-compress) │
│ 2. Gets response with compressed outputs │
│ 3. Can call headroom_compress for more │
│ 4. headroom_retrieve checks: │
│ local store → proxy store │
└──────────────────┬──────────────────────────┘
│ MCP (stdio)
▼
┌─────────────────────────────────────────────┐
│ Headroom MCP Server │
│ ├── headroom_compress (local compression) │
│ ├── headroom_retrieve (local + proxy) │
│ └── headroom_stats (aggregated stats) │
└─────────────────────────────────────────────┘
No double-compression: the proxy compresses at the HTTP level, before the LLM sees the content, while the MCP tools run only after the LLM has received it. The two paths never operate on the same data.
CLI Commands¶
Install¶
headroom mcp install # Default setup
headroom mcp install --proxy-url http://host:9000 # Custom proxy URL
headroom mcp install --force # Overwrite existing
Status¶
headroom mcp status
Headroom MCP Status
========================================
MCP SDK: ✓ Installed
Claude Config: ✓ Configured
/Users/you/.claude/mcp.json
Proxy URL: http://127.0.0.1:8787
Proxy Status: ✓ Running at http://127.0.0.1:8787
Uninstall¶
Debug¶
Cross-Tool Compatibility¶
The MCP server works with any MCP-compatible host:
| Tool | MCP Support | Setup |
|---|---|---|
| Claude Code | Native | headroom mcp install |
| Cursor | Supported | Add to Cursor MCP settings |
| Codex | If supported | Configure MCP server |
| Any MCP host | Yes | Point to headroom mcp serve |
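For hosts that are configured through a JSON file, "pointing to" the server usually means registering a command and its arguments. The sketch below generates such an entry. It assumes the mcpServers layout that several MCP hosts use; the server name "headroom" and the config file location are assumptions, so check your host's documentation before writing the output anywhere.

```python
import json


def mcp_server_entry(name: str = "headroom",
                     command: str = "headroom",
                     args: tuple = ("mcp", "serve")) -> dict:
    """Build a stdio server entry in the commonly used mcpServers layout."""
    return {"mcpServers": {name: {"command": command, "args": list(args)}}}


# Print the fragment so it can be merged into the host's MCP config by hand.
print(json.dumps(mcp_server_entry(), indent=2))
```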
Troubleshooting¶
"MCP SDK not installed"¶
pip install "headroom-ai[mcp]"
"Proxy not running" (when using proxy features)¶
"Entry not found or expired"¶
- Content compressed via headroom_compress: stored for 1 hour (session TTL)
- Content compressed by the proxy: stored for 5 minutes (proxy TTL)
- The proxy must be running for proxy-compressed content
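The expiry behaviour behind this error can be illustrated with a small time-to-live store. This is a sketch of the general technique, not Headroom's CompressionStore; the class name and the injectable clock are assumptions made so the example can be exercised without waiting an hour.

```python
import time


class TTLStore:
    """Minimal key-value store whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be simulated
        self._items = {}

    def put(self, key: str, value: str) -> None:
        # Record the value together with its expiry time.
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key: str):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        return value


# Simulated clock: a one-element list the lambda reads from.
now = [0.0]
store = TTLStore(ttl_seconds=3600, clock=lambda: now[0])
store.put("a1b2", "original text")

now[0] = 3599.0
print(store.get("a1b2"))  # original text

now[0] = 3600.0
print(store.get("a1b2"))  # None
```

A miss after expiry is indistinguishable from a hash that was never stored, which is consistent with the single "not found or expired" message.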
Claude doesn't see headroom tools¶
- Check: headroom mcp status
- Restart Claude Code after installing MCP
- Verify with /mcp in Claude Code (it should show 3 headroom tools)
Sub-agent stats not showing¶
Sub-agent stats appear in headroom_stats only after sub-agents have run compressions. The shared stats file is at ~/.headroom/session_stats.jsonl.