Universal Compression¶
Headroom's Universal Compression module provides intelligent, automatic compression with ML-based content detection and structure preservation.
Overview¶
Universal Compression combines several techniques:
- ML-based Detection - Automatically detects content type (JSON, code, logs, text) using Magika
- Structure Preservation - Keeps keys, signatures, and templates intact via structure masks
- Intelligent Compression - Compresses content while preserving meaning with LLMLingua
- Reversible via CCR - Stores originals for retrieval when the LLM needs full context
Quick Start¶
One-Liner¶
```python
from headroom.compression import compress

result = compress(content)
print(result.compressed)
print(f"Saved {result.savings_percentage:.0f}% tokens")
```
With Configuration¶
```python
from headroom.compression import UniversalCompressor, UniversalCompressorConfig

config = UniversalCompressorConfig(
    compression_ratio_target=0.5,   # Keep 50% of content
    use_entropy_preservation=True,  # Preserve UUIDs, hashes
)

compressor = UniversalCompressor(config=config)
result = compressor.compress(content)
```
How It Works¶
Detection Flow¶
```
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Content   │───>│   Detect    │───>│   Extract   │───>│  Compress   │
│    Input    │    │    Type     │    │  Structure  │    │   Content   │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                          │                  │                  │
                          ▼                  ▼                  ▼
                   ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
                   │   Magika    │    │   Handler   │    │  LLMLingua  │
                   │    (ML)     │    │   (JSON,    │    │ (optional)  │
                   │             │    │   Code...)  │    │             │
                   └─────────────┘    └─────────────┘    └─────────────┘
```
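As a rough illustration of the first stage, the sketch below fakes type detection with naive heuristics. This is not the library's detector (that is Magika's ML model); it only shows where detection fits in the flow.

```python
# Naive stand-in for the "Detect Type" stage above -- NOT the library's
# detector. Magika classifies content with an ML model; this just uses
# crude heuristics to make the dispatch idea concrete.
import json

def detect_type(content: str) -> str:
    try:
        json.loads(content)
        return "json"
    except ValueError:
        pass
    if content.lstrip().startswith(("def ", "class ", "import ")):
        return "code"
    return "text"

print(detect_type('{"id": "usr_123"}'))   # json
print(detect_type("def hello(): pass"))   # code
print(detect_type("Plain text content"))  # text
```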
Structure Masks¶
Structure masks identify what to preserve:
| Content Type | What's Preserved | What's Compressed |
|---|---|---|
| JSON | Keys, brackets, booleans, nulls, short values, UUIDs | Long string values, whitespace |
| Code | Imports, function signatures, class definitions, types | Function bodies, comments |
| Logs | Timestamps, log levels, error messages | Repeated patterns, verbose details |
| Text | High-entropy tokens (IDs, hashes) | Low-information content |
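The table is easiest to picture as one boolean flag per token. A minimal sketch of the idea (the library's `StructureMask` carries more metadata than a bare list):

```python
# Illustration only: a structure mask as one boolean per token.
# True = structural (kept verbatim), False = eligible for compression.
tokens = ['{', '"bio"', ':', '"A long description that goes on..."', '}']
mask   = [True, True, True, False, True]

structural = [tok for tok, keep in zip(tokens, mask) if keep]
print(structural)  # ['{', '"bio"', ':', '}'] -- keys and syntax survive
```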
Configuration¶
UniversalCompressorConfig¶
```python
from headroom.compression import UniversalCompressorConfig

config = UniversalCompressorConfig(
    # Detection
    use_magika=True,                # Use ML-based detection (requires magika)

    # Compression
    use_llmlingua=True,             # Use LLMLingua for compression
    compression_ratio_target=0.3,   # Keep 30% of content (70% reduction)
    min_content_length=100,         # Skip content shorter than this

    # Structure preservation
    use_entropy_preservation=True,  # Preserve high-entropy tokens
    entropy_threshold=0.85,         # Entropy threshold for preservation

    # CCR
    ccr_enabled=True,               # Store originals for retrieval
)
```
Configuration Options¶
| Option | Default | Description |
|---|---|---|
| `use_magika` | `True` | Use ML-based content detection |
| `use_llmlingua` | `True` | Use LLMLingua for compression |
| `compression_ratio_target` | `0.3` | Target ratio (0.3 = keep 30%) |
| `min_content_length` | `100` | Minimum chars to compress |
| `use_entropy_preservation` | `True` | Preserve high-entropy tokens |
| `entropy_threshold` | `0.85` | Entropy threshold (0.0-1.0) |
| `ccr_enabled` | `True` | Enable CCR storage |
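For intuition about `entropy_threshold`: identifier-like tokens (hashes, UUIDs) have high character-level entropy, repetitive text low. A rough proxy is sketched below; the library's exact scoring and scale may differ.

```python
# Character-level Shannon entropy as a rough proxy for "high-entropy token".
# This is illustrative -- not the library's actual scoring function.
import math
from collections import Counter

def bits_per_char(token: str) -> float:
    counts = Counter(token)
    n = len(token)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(f"{bits_per_char('e3b0c44298fc1c149afbf4c8996fb924'):.2f}")  # hash-like: ~3.4
print(f"{bits_per_char('abababababababab'):.2f}")                  # repetitive: 1.00
```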
Content Handlers¶
JSON Handler¶
Preserves JSON structure while compressing values:
```python
from headroom.compression.handlers.json_handler import JSONStructureHandler

handler = JSONStructureHandler(
    preserve_short_values=True,  # Keep values < 20 chars
    short_value_threshold=20,    # Threshold for "short"
    preserve_high_entropy=True,  # Keep UUIDs, hashes
    entropy_threshold=0.85,      # Entropy threshold
    max_array_items_full=3,      # Keep first N array items full
    max_number_digits=10,        # Preserve numbers up to N digits
)
```
What's Preserved:
- All keys (navigational - LLM sees schema)
- Structural syntax ({, }, [, ], :, ,)
- Booleans and nulls (semantically important)
- High-entropy strings (UUIDs, hashes - identifiers)
- Short numbers (often IDs)
Example:
```
# Before
{
  "id": "usr_abc123",
  "name": "Alice Johnson",
  "bio": "A long description that goes on and on..."
}

# After (structure preserved, long values compressed)
{
  "id": "usr_abc123",
  "name": "Alice Johnson",
  "bio": "A long...[compressed]..."
}
```
Code Handler¶
Preserves code structure using AST parsing (tree-sitter) or a regex fallback:
```python
from headroom.compression.handlers.code_handler import CodeStructureHandler

handler = CodeStructureHandler(
    preserve_comments=False,    # If True, keep comments as structural
    use_tree_sitter=True,       # Use tree-sitter for parsing
    default_language="python",  # Default when detection fails
)
```
What's Preserved:
- Import statements
- Function/method signatures
- Class definitions
- Type annotations
- Decorators
What's Compressed:
- Function bodies (implementations)
- Comments (unless preserve_comments=True)
Example:
```python
# Before
def process_data(items: List[str]) -> Dict[str, int]:
    """Process items and count occurrences."""
    result = {}
    for item in items:
        item = item.strip().lower()
        if item in result:
            result[item] += 1
        else:
            result[item] = 1
    return result

# After (signature preserved, body compressed)
def process_data(items: List[str]) -> Dict[str, int]:
    """Process items and count occurrences."""
    result = {}
    for item in items:
        ...[compressed]...
```
Supported Languages¶
| Language | Parser | Support Level |
|---|---|---|
| Python | tree-sitter | Full AST |
| JavaScript | tree-sitter | Full AST |
| TypeScript | tree-sitter | Full AST |
| Go | tree-sitter | Full AST |
| Rust | tree-sitter | Full AST |
| Java | tree-sitter | Full AST |
| C | tree-sitter | Full AST |
| C++ | tree-sitter | Full AST |
Compression Result¶
```python
from headroom.compression import compress

result = compress(content)

# Access result fields
print(result.compressed)            # Compressed content
print(result.original)              # Original content
print(result.compression_ratio)    # e.g., 0.35 (35% of original size)
print(result.tokens_before)        # Estimated tokens before
print(result.tokens_after)         # Estimated tokens after
print(result.tokens_saved)         # tokens_before - tokens_after
print(result.savings_percentage)   # e.g., 65.0 (65% savings)

# Detection info
print(result.content_type)          # ContentType.JSON, CODE, etc.
print(result.detection_confidence)  # 0.0-1.0

# Structure info
print(result.handler_used)          # "json", "code", etc.
print(result.preservation_ratio)    # Fraction preserved as structure

# CCR info
print(result.ccr_key)               # Key for retrieval (if CCR enabled)
```
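In practice you may want to keep the original when compression barely helps. A small guard using the fields above; the 20% and 0.8 cutoffs are arbitrary examples, not recommendations:

```python
# Fall back to the original when savings are marginal or detection is shaky.
# Thresholds are illustrative only.
if result.savings_percentage >= 20 and result.detection_confidence >= 0.8:
    payload = result.compressed
else:
    payload = result.original
```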
Batch Compression¶
When compressing many items at once, batch mode is more efficient:
```python
from headroom.compression import UniversalCompressor

compressor = UniversalCompressor()

contents = [
    '{"users": [...]}',
    'def hello(): pass',
    'Plain text content',
]

results = compressor.compress_batch(contents)
for result in results:
    print(f"{result.content_type}: {result.savings_percentage:.0f}% saved")
```
Custom Handlers¶
Register custom handlers for specific content types:
```python
from headroom.compression import UniversalCompressor
from headroom.compression.detector import ContentType
from headroom.compression.handlers.base import BaseStructureHandler, HandlerResult
from headroom.compression.masks import StructureMask

class LogStructureHandler(BaseStructureHandler):
    """Custom handler for log content."""

    def __init__(self):
        super().__init__(name="log")

    def can_handle(self, content: str) -> bool:
        return "[INFO]" in content or "[ERROR]" in content

    def _extract_mask(self, content, tokens, **kwargs):
        # Mark timestamps and log levels as structural
        mask = [False] * len(tokens)  # one flag per token
        # ... (custom logic)
        return HandlerResult(
            mask=StructureMask(tokens=tokens, mask=mask),
            handler_name=self.name,
            confidence=0.9,
        )

# Register the custom handler
compressor = UniversalCompressor()
compressor.register_handler(ContentType.TEXT, LogStructureHandler())
```
CCR Integration¶
Universal Compression integrates with CCR (Compress-Cache-Retrieve) for reversible compression:
```python
from headroom.compression import UniversalCompressor, UniversalCompressorConfig

config = UniversalCompressorConfig(ccr_enabled=True)
compressor = UniversalCompressor(config=config)

result = compressor.compress(large_content)

# CCR key for retrieval
if result.ccr_key:
    print(f"Original stored with key: {result.ccr_key}")
    # LLM can request original via CCR when needed
```
See the CCR Guide for full CCR documentation.
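Conceptually, CCR behaves like a keyed cache of originals. The sketch below illustrates that idea with a plain dict; it is not the library's store API (see the CCR Guide for that).

```python
# Concept sketch only -- a dict stands in for the real CCR store,
# whose actual API is documented in the CCR Guide.
from typing import Dict, Optional

cache: Dict[str, str] = {}

def store_original(key: str, original: str) -> None:
    cache[key] = original  # a real CCR store may be persistent or shared

def retrieve_original(key: str) -> Optional[str]:
    return cache.get(key)  # this is where an LLM's retrieval request lands

store_original("ccr:abc123", '{"full": "original payload"}')
print(retrieve_original("ccr:abc123"))
```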
Performance¶
| Content Type | Compression | Speed | Accuracy |
|---|---|---|---|
| JSON (large arrays) | 70-90% | ~1ms | Keys preserved |
| Code (Python) | 50-70% | ~10ms | Signatures preserved |
| Plain text | 60-80% | ~5ms | High-entropy preserved |
Overhead: ~1-10ms per compression depending on content size and type.
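These numbers vary with hardware and content; measuring on your own payloads is straightforward:

```python
# Time a single compression with the standard library.
import time

from headroom.compression import compress

start = time.perf_counter()
result = compress(content)  # `content` is your own payload
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"{elapsed_ms:.1f}ms, {result.savings_percentage:.0f}% saved")
```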
Installation¶
```bash
# Basic compression (falls back to simple compression)
pip install headroom-ai

# With ML detection (recommended)
pip install "headroom-ai[magika]"

# With LLMLingua compression
pip install "headroom-ai[llmlingua]"

# With AST-based code handling
pip install "headroom-ai[code]"

# Everything
pip install "headroom-ai[all]"
```
Example: Full Pipeline¶
```python
from headroom.compression import UniversalCompressor, UniversalCompressorConfig

# Configure for aggressive compression
config = UniversalCompressorConfig(
    compression_ratio_target=0.25,  # Keep 25%
    use_magika=True,
    use_llmlingua=True,
    ccr_enabled=True,
)

compressor = UniversalCompressor(config=config)

# Compress JSON API response
json_content = """
{
    "users": [
        {"id": "usr_123", "name": "Alice", "bio": "Software engineer..."},
        {"id": "usr_456", "name": "Bob", "bio": "Product manager..."}
    ],
    "total": 2,
    "page": 1
}
"""

result = compressor.compress(json_content)

print(f"Type: {result.content_type}")                           # ContentType.JSON
print(f"Handler: {result.handler_used}")                        # json
print(f"Saved: {result.savings_percentage:.0f}%")               # ~60%
print(f"Structure: {result.preservation_ratio:.0%} preserved")  # ~40%
print(f"CCR Key: {result.ccr_key}")                             # For retrieval
```
See Also¶
- Transforms Reference - Other compression transforms
- CCR Guide - Reversible compression architecture
- Text Compression - Opt-in utilities for search/logs