Headroom¶
The Context Optimization Layer for LLM Applications
Tool outputs are 70-95% redundant. Headroom compresses that away—without losing information.
Quick Install¶
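Assuming the package is published on PyPI under the same name as the import (`headroom`):

```shell
# Package name is inferred from the SDK import; adjust if it differs on PyPI
pip install headroom
```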
Quick Start¶
Option 1: Proxy (Zero Code Changes)¶
Start the proxy:
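A hypothetical invocation; the actual command name, flags, and default port may differ, so check the Proxy Documentation linked below:

```shell
# Hypothetical CLI — command and --port flag are assumptions, not confirmed API
headroom proxy --port 8787
```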
Point your tools at it:
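For example, the official OpenAI SDKs read a standard environment variable for the API base URL, so no code changes are needed; the port here is an assumption and must match wherever the proxy listens:

```shell
# OPENAI_BASE_URL is the standard OpenAI SDK override; port 8787 is illustrative
export OPENAI_BASE_URL=http://localhost:8787/v1
```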
That's it. Your existing code works unchanged, with 40-90% fewer tokens.
Option 2: Python SDK¶
```python
from headroom import Headroom

hr = Headroom()

# Compress tool output before sending to the LLM
compressed = hr.compress(large_tool_output)

# If the LLM needs the full data, retrieve the original
original = hr.retrieve(compressed)
```
Why Headroom?¶
| Problem | Solution |
|---|---|
| Tool outputs bloat context with repetitive JSON | Statistical compression removes redundancy |
| Dynamic content breaks provider caching | Cache alignment stabilizes prefixes |
| Long conversations exceed context limits | Intelligent scoring drops low-value messages |
| Compressed data might be needed later | CCR stores originals for on-demand retrieval |
Results¶
100 log entries. One critical error buried at position 67.
| Metric | Baseline | Headroom |
|---|---|---|
| Input tokens | 10,144 | 1,260 |
| Correct answers | 4/4 | 4/4 |
87.6% fewer tokens. Same answer.
The FATAL error was automatically preserved—no configuration needed.
How It Works¶
- Intercepts context — Tool outputs, logs, search results
- Compresses intelligently — Keeps errors, outliers, boundaries
- Stores originals — Full data available if LLM requests it
- Aligns for caching — Provider caches actually hit
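The compress-then-store-then-retrieve loop can be sketched in a few lines. This is a toy illustration of the idea, not Headroom's actual internals: the function names, the preview length, and the in-memory store are all assumptions.

```python
import hashlib
import json

# Illustrative CCR store: originals are kept in memory, keyed by content hash.
# Headroom's real storage and compression are more sophisticated than this.
_store: dict[str, str] = {}

def ccr_compress(payload: str, keep: int = 200) -> str:
    """Replace a large payload with a short preview plus a retrieval key."""
    key = hashlib.sha256(payload.encode()).hexdigest()[:12]
    _store[key] = payload  # keep the original for on-demand retrieval
    return json.dumps({"ccr_key": key, "preview": payload[:keep]})

def ccr_retrieve(compressed: str) -> str:
    """Fetch the original payload if the LLM asks for it."""
    key = json.loads(compressed)["ccr_key"]
    return _store[key]
```

The point of the design: the LLM sees only the compact preview, but nothing is lost, because the key round-trips back to the full original whenever it is actually needed.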
Integrations¶
Features¶
**Compression**
- Statistical JSON array compression (no hardcoded rules)
- ML-based text compression via LLMLingua
- AST-aware code compression
- Image optimization (40-90% reduction)
**Context Management**
- Intelligent message scoring and dropping
- Compress-Cache-Retrieve (CCR) for lossless compression
- Provider cache alignment for better hit rates
**Operations**
- Prometheus metrics endpoint
- Request logging and cost tracking
- Budget limits and rate limiting
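To make "statistical" array compression concrete, here is a toy sketch of the idea: summarize rows that match the dominant pattern and keep only the outliers verbatim. The function name and the single-field heuristic are illustrative assumptions; Headroom's actual algorithm is more general.

```python
from collections import Counter

def compress_rows(rows: list[dict], field: str) -> dict:
    """Toy sketch: collapse the most common value of `field` into a summary,
    keep rows that deviate from it (e.g. a FATAL line among INFO logs)."""
    counts = Counter(row[field] for row in rows)
    common, n = counts.most_common(1)[0]
    outliers = [row for row in rows if row[field] != common]
    return {"summary": {field: common, "count": n}, "outliers": outliers}
```

On the benchmark shape described above (99 routine log entries plus one FATAL), this keeps the one anomalous row intact while reducing the rest to a single summary line.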
Next Steps¶
- Quickstart Guide — Get running in 5 minutes
- Proxy Documentation — Configure the optimization proxy
- Architecture — Deep dive into how it works
License¶
Apache 2.0 — Free for commercial use.