TypeScript SDK¶
The Headroom TypeScript SDK lets any JavaScript or TypeScript application compress LLM messages before sending them to a model. It saves tokens, reduces costs, and fits more context into every request.
Install¶
Install the headroom-ai package from npm, for example with npm install headroom-ai. The SDK requires a running Headroom proxy or a Headroom Cloud API key.
Quick Start¶
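import OpenAI from 'openai';
import { compress } from 'headroom-ai';

const openai = new OpenAI();

// messages: your existing chat messages in OpenAI format
const result = await compress(messages, { model: 'gpt-4o' });
console.log(`Saved ${result.tokensSaved} tokens`);

// Send the compressed messages in place of the originals
const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: result.messages,
});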
How It Works¶
The TypeScript SDK is an HTTP client. When you call compress(), it sends your messages to the Headroom proxy's POST /v1/compress endpoint. The proxy runs the full compression pipeline (SmartCrusher, ContentRouter, CacheAligner, etc.) and returns compressed messages. No compression logic runs in Node.js — all the heavy lifting happens in the proxy.
Your TypeScript App
    │
    │  compress(messages)
    ▼
headroom-ai (npm)          ← HTTP client
    │
    │  POST /v1/compress
    ▼
Headroom Proxy / Cloud     ← compression pipeline (Python)
    │
    │  compressed messages
    ▼
Your TypeScript App
    │
    │  openai.chat.completions.create(compressed)
    ▼
LLM Provider
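To make the flow above more concrete, here is a minimal sketch of the kind of request an HTTP client like this might build. Only the POST /v1/compress path and the default base URL come from this page. The JSON body shape ({ messages, model }), the Bearer authorization header, and the helper name buildCompressRequest are assumptions for illustration, not the SDK's documented wire format.

```typescript
// Hypothetical sketch: the body shape and auth header are assumptions.
// Only the endpoint path and the default base URL come from the docs.
type ChatMessage = { role: string; content: string };

interface SketchOptions {
  model: string;
  baseUrl?: string;
  apiKey?: string;
}

function buildCompressRequest(messages: ChatMessage[], opts: SketchOptions) {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const baseUrl = (opts.baseUrl ?? 'http://localhost:8787').replace(/\/+$/, '');
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  if (opts.apiKey) {
    headers['Authorization'] = `Bearer ${opts.apiKey}`; // assumed auth scheme
  }
  return {
    url: `${baseUrl}/v1/compress`,
    init: {
      method: 'POST',
      headers,
      body: JSON.stringify({ messages, model: opts.model }), // assumed body shape
    },
  };
}
```

The returned url and init could be passed to fetch(url, init). Building the request separately from sending it keeps the sketch easy to inspect without a running proxy.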
Core API: compress()¶
import { compress } from 'headroom-ai';

const result = await compress(messages, {
  model: 'gpt-4o',                    // model name (for token counting)
  baseUrl: 'http://localhost:8787',   // proxy URL (default)
  apiKey: 'hr_...',                   // Headroom Cloud key
  timeout: 30000,                     // ms (default)
  fallback: true,                     // return uncompressed if proxy down (default)
  retries: 1,                         // retry on transient errors (default)
});

result.messages           // compressed messages (same format as input)
result.tokensBefore       // original token count
result.tokensAfter        // compressed token count
result.tokensSaved        // tokens removed
result.compressionRatio   // tokensAfter / tokensBefore
result.transformsApplied  // e.g. ['router:smart_crusher:0.35']
result.compressed         // false if fallback kicked in
Messages use standard OpenAI chat format: { role, content, tool_calls?, tool_call_id? }.
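The result fields above lend themselves to simple logging. The sketch below shows one way to turn them into a readable summary. The field names mirror the ones listed above, while CompressStats and summarizeSavings are hypothetical names introduced here and are not part of the SDK.

```typescript
// Field names mirror the result fields listed above; summarizeSavings is a
// hypothetical helper, not an SDK export.
interface CompressStats {
  tokensBefore: number;
  tokensAfter: number;
  tokensSaved: number;
  compressionRatio: number;
  compressed: boolean;
}

function summarizeSavings(stats: CompressStats): string {
  if (!stats.compressed) {
    // compressed is false when the fallback returned the original messages.
    return 'Compression skipped (fallback): messages sent unchanged';
  }
  // compressionRatio is tokensAfter / tokensBefore, so the saved share is 1 - ratio.
  const percentSaved = (1 - stats.compressionRatio) * 100;
  return `Saved ${stats.tokensSaved} of ${stats.tokensBefore} tokens (${percentSaved.toFixed(1)}%)`;
}
```

For a result with tokensBefore 1000, tokensAfter 350, and compressionRatio 0.35, this returns "Saved 650 of 1000 tokens (65.0%)".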
Environment Variables¶
Instead of passing options, set environment variables:
- HEADROOM_BASE_URL: proxy or cloud URL (default: http://localhost:8787)
- HEADROOM_API_KEY: Headroom Cloud API key
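As an illustration of how options and environment variables could be combined, here is a small sketch. The precedence order (explicit option first, then environment variable, then the documented default) is an assumption, and resolveConnection is a hypothetical helper rather than the SDK's own code.

```typescript
// Hypothetical sketch of option resolution. The precedence order
// (explicit option, then environment variable, then default) is an assumption.
interface ConnectionOptions {
  baseUrl?: string;
  apiKey?: string;
}

type Env = Record<string, string | undefined>;

function resolveConnection(options: ConnectionOptions, env: Env) {
  return {
    baseUrl: options.baseUrl ?? env.HEADROOM_BASE_URL ?? 'http://localhost:8787',
    apiKey: options.apiKey ?? env.HEADROOM_API_KEY,
  };
}
```

In a Node.js app the second argument would typically be process.env. Passing the environment in explicitly keeps the function easy to test.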
Reusable Client¶
For apps making many calls, create a client once and reuse it:
import { HeadroomClient } from 'headroom-ai';

const client = new HeadroomClient({
  baseUrl: 'http://localhost:8787',
  apiKey: 'hr_...',
});

const r1 = await client.compress(messages1, { model: 'gpt-4o' });
const r2 = await client.compress(messages2, { model: 'gpt-4o' });
Framework Adapters¶
Vercel AI SDK¶
The Headroom middleware plugs directly into Vercel AI SDK's wrapLanguageModel():
import { headroomMiddleware } from 'headroom-ai/vercel-ai';
import { wrapLanguageModel, generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const model = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: headroomMiddleware(),
});

// All calls through this model are automatically compressed
const { text } = await generateText({ model, messages });
The middleware intercepts messages in the transformParams hook, converts Vercel's internal format to OpenAI format, compresses via the proxy, and converts back. Your app code doesn't change.
You can also compress Vercel messages directly:
import { compressVercelMessages } from 'headroom-ai/vercel-ai';
const result = await compressVercelMessages(modelMessages, { model: 'gpt-4o' });
// result.messages is in Vercel ModelMessage[] format
OpenAI SDK¶
Wrap your OpenAI client to auto-compress messages on every chat.completions.create() call:
import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

// Messages are compressed before sending; your calling code stays the same
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: longConversation,
});
Only chat.completions.create() is intercepted. All other methods (embeddings, images, audio) pass through unchanged.
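The interception described above can be pictured with JavaScript's built-in Proxy object. The following is a minimal sketch of the general technique under simplified types, not the SDK's implementation of withHeadroom(). The names withCompression, CompressFn, and ChatClientLike are hypothetical.

```typescript
// Hypothetical sketch: withCompression and CompressFn are illustrative names,
// not the SDK's implementation of withHeadroom().
type Message = { role: string; content: string };
type CreateParams = { model: string; messages: Message[] };
type CompressFn = (messages: Message[]) => Promise<Message[]>;

interface ChatClientLike {
  chat: { completions: { create(params: CreateParams): Promise<unknown> } };
}

function withCompression<T extends ChatClientLike>(client: T, compress: CompressFn): T {
  // Intercept only `create`; every other property is read from the target.
  const completions = new Proxy(client.chat.completions, {
    get(target, prop, receiver) {
      if (prop === 'create') {
        return async (params: CreateParams) => {
          const messages = await compress(params.messages);
          return target.create({ ...params, messages });
        };
      }
      return Reflect.get(target, prop, receiver);
    },
  });
  const chat = new Proxy(client.chat, {
    get: (target, prop, receiver) =>
      prop === 'completions' ? completions : Reflect.get(target, prop, receiver),
  });
  // Top-level proxy: swap in the wrapped `chat`, pass everything else through.
  return new Proxy(client, {
    get: (target, prop, receiver) =>
      prop === 'chat' ? chat : Reflect.get(target, prop, receiver),
  });
}
```

Because only the chat.completions.create path is replaced, properties such as an embeddings namespace on the wrapped client are the same objects as on the original client.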
Anthropic SDK¶
Same pattern for the Anthropic client:
import { withHeadroom } from 'headroom-ai/anthropic';
import Anthropic from '@anthropic-ai/sdk';

const client = withHeadroom(new Anthropic());

const response = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  messages: longConversation,
  max_tokens: 1024,
});
Only messages.create() is intercepted. The adapter converts between Anthropic's content block format and OpenAI format automatically.
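To give a sense of what such a format conversion involves, the sketch below maps one Anthropic-style message to an OpenAI-style message. It is deliberately simplified: it handles only text and tool_use blocks, and toOpenAIMessage is a hypothetical helper, not the adapter's actual code. Joining multiple text blocks with a newline is also an assumption.

```typescript
// Simplified sketch: handles only text and tool_use blocks. toOpenAIMessage is
// a hypothetical helper, not the adapter's actual code.
type TextBlock = { type: 'text'; text: string };
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown };
type AnthropicMessage = {
  role: 'user' | 'assistant';
  content: string | Array<TextBlock | ToolUseBlock>;
};

type OpenAIToolCall = {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
};
type OpenAIMessage = {
  role: 'user' | 'assistant';
  content: string | null;
  tool_calls?: OpenAIToolCall[];
};

function toOpenAIMessage(message: AnthropicMessage): OpenAIMessage {
  if (typeof message.content === 'string') {
    return { role: message.role, content: message.content };
  }
  const textParts: string[] = [];
  const toolCalls: OpenAIToolCall[] = [];
  for (const block of message.content) {
    if (block.type === 'text') {
      textParts.push(block.text);
    } else {
      // OpenAI tool calls carry their arguments as a JSON string.
      toolCalls.push({
        id: block.id,
        type: 'function',
        function: { name: block.name, arguments: JSON.stringify(block.input) },
      });
    }
  }
  const result: OpenAIMessage = {
    role: message.role,
    content: textParts.length > 0 ? textParts.join('\n') : null, // newline join is an assumption
  };
  if (toolCalls.length > 0) result.tool_calls = toolCalls;
  return result;
}
```

A full adapter would also need the reverse mapping and support for tool results and other block types, which this sketch leaves out.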
Error Handling¶
import { compress, HeadroomConnectionError, HeadroomAuthError } from 'headroom-ai';

try {
  const result = await compress(messages, { model: 'gpt-4o', fallback: false });
} catch (error) {
  if (error instanceof HeadroomAuthError) {
    // Invalid API key (401)
  } else if (error instanceof HeadroomConnectionError) {
    // Proxy unreachable
  }
}
With fallback: true (the default), connection errors and 5xx responses return the original messages uncompressed instead of throwing. Auth errors (401) and client errors (400) always throw.
Fallback Behavior¶
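By default, compress() never blocks your app: if the proxy is unreachable or returns a server error, you get the original messages back. The table summarizes each failure scenario for both fallback settings: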
| Scenario | fallback: true (default) | fallback: false |
|---|---|---|
| Proxy unreachable | Returns uncompressed, compressed: false | Throws HeadroomConnectionError |
| Proxy 503 error | Returns uncompressed after retries | Throws HeadroomCompressError |
| Invalid API key (401) | Throws HeadroomAuthError | Throws HeadroomAuthError |
| Bad request (400) | Throws HeadroomCompressError | Throws HeadroomCompressError |
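The table can be read as a small decision function. The sketch below encodes it that way, purely as an illustration. Failure, Outcome, and decideOutcome are hypothetical names, retries are left out, and treating every 4xx status other than 401 like the 400 row is an assumption, since the table lists only 400.

```typescript
// Hypothetical encoding of the fallback table; not the SDK's internal logic.
type Failure = { kind: 'unreachable' } | { kind: 'http'; status: number };

type Outcome =
  | { action: 'return_uncompressed' }
  | {
      action: 'throw';
      error: 'HeadroomConnectionError' | 'HeadroomAuthError' | 'HeadroomCompressError';
    };

function decideOutcome(failure: Failure, fallback: boolean): Outcome {
  if (failure.kind === 'unreachable') {
    return fallback
      ? { action: 'return_uncompressed' }
      : { action: 'throw', error: 'HeadroomConnectionError' };
  }
  if (failure.status === 401) {
    // Auth errors throw regardless of the fallback setting.
    return { action: 'throw', error: 'HeadroomAuthError' };
  }
  if (failure.status >= 500 && fallback) {
    // Server errors fall back to the original messages (after retries).
    return { action: 'return_uncompressed' };
  }
  // 400 per the table; other 4xx statuses are assumed to behave the same way.
  return { action: 'throw', error: 'HeadroomCompressError' };
}
```

Writing the rules out this way makes the asymmetry visible: fallback only changes the outcome for connection failures and server errors, never for 401 or 400 responses.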
Zero Dependencies¶
The headroom-ai package has no runtime dependencies. Framework SDKs (Vercel AI, OpenAI, Anthropic) are optional peer dependencies — only install what you use.
OpenClaw Plugin¶
The TypeScript SDK powers the headroom-openclaw plugin for OpenClaw agents. The plugin uses HeadroomClient internally to compress context during the assemble() lifecycle hook. Install it with openclaw plugins install headroom-openclaw. See the plugin source for details.
Comparison with Python SDK¶
| Feature | Python SDK | TypeScript SDK |
|---|---|---|
| compress() | Native (runs locally) | HTTP client (calls proxy) |
| Proxy | Built-in server | Connects to proxy |
| Vercel AI SDK | N/A | Middleware adapter |
| OpenAI SDK | HeadroomClient wrapper | withHeadroom() wrapper |
| Anthropic SDK | HeadroomClient wrapper | withHeadroom() wrapper |
| LangChain | HeadroomChatModel | Use compress() directly |
| Memory system | Full (SQLite + HNSW) | Not yet (use proxy) |
| MCP server | Built-in | Not yet |
| CLI tools | headroom proxy, headroom wrap, etc. | N/A (use Python CLI) |