
TypeScript SDK

The Headroom TypeScript SDK lets any JavaScript or TypeScript application compress LLM messages before sending them to a model. It saves tokens, reduces costs, and fits more context into every request.

Install

npm install headroom-ai

Requires a running Headroom proxy or Headroom Cloud API key.

Quick Start

import { compress } from 'headroom-ai';

const result = await compress(messages, { model: 'gpt-4o' });
console.log(`Saved ${result.tokensSaved} tokens`);

const response = await openai.chat.completions.create({
  model: 'gpt-4o',
  messages: result.messages,
});

How It Works

The TypeScript SDK is an HTTP client. When you call compress(), it sends your messages to the Headroom proxy's POST /v1/compress endpoint. The proxy runs the full compression pipeline (SmartCrusher, ContentRouter, CacheAligner, etc.) and returns compressed messages. No compression logic runs in Node.js — all the heavy lifting happens in the proxy.

Your TypeScript App
    │  compress(messages)
headroom-ai (npm)  ← HTTP client
    │  POST /v1/compress
Headroom Proxy / Cloud  ← compression pipeline (Python)
    │  compressed messages
Your TypeScript App
    │  openai.chat.completions.create(compressed)
LLM Provider
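The flow above can be sketched as the request compress() assembles before sending it to the proxy. This is an illustrative reconstruction, not the SDK's internals; the helper name and the Authorization header shape are assumptions, while the endpoint path and default URL come from this page.

```typescript
// Illustrative sketch of the HTTP request compress() sends to the proxy.
type ChatMessage = { role: string; content: string };

interface CompressOptions {
  model: string;   // model name, used by the proxy for token counting
  baseUrl?: string;
  apiKey?: string; // Headroom Cloud key, if any
}

// Builds the URL, headers, and JSON body for POST /v1/compress.
function buildCompressRequest(messages: ChatMessage[], opts: CompressOptions) {
  const baseUrl = opts.baseUrl ?? "http://localhost:8787";
  return {
    url: `${baseUrl}/v1/compress`,
    headers: {
      "Content-Type": "application/json",
      // Bearer auth is an assumption about the wire format.
      ...(opts.apiKey ? { Authorization: `Bearer ${opts.apiKey}` } : {}),
    },
    body: JSON.stringify({ messages, model: opts.model }),
  };
}
```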

Core API: compress()

import { compress } from 'headroom-ai';

const result = await compress(messages, {
  model: 'gpt-4o',                      // model name (for token counting)
  baseUrl: 'http://localhost:8787',      // proxy URL (default)
  apiKey: 'hr_...',                      // Headroom Cloud key
  timeout: 30000,                        // ms (default)
  fallback: true,                        // return uncompressed if proxy down (default)
  retries: 1,                            // retry on transient errors (default)
});

result.messages          // compressed messages (same format as input)
result.tokensBefore      // original token count
result.tokensAfter       // compressed token count
result.tokensSaved       // tokens removed
result.compressionRatio  // tokensAfter / tokensBefore
result.transformsApplied // e.g. ['router:smart_crusher:0.35']
result.compressed        // false if fallback kicked in

Messages use standard OpenAI chat format: { role, content, tool_calls?, tool_call_id? }.
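A small example of working with these result fields, assuming the shape listed above (the summarize helper is illustrative, not part of the SDK):

```typescript
// Illustrative helper: turn a compress() result into a log line.
interface CompressResult {
  tokensBefore: number;
  tokensAfter: number;
  tokensSaved: number;
  compressionRatio: number; // tokensAfter / tokensBefore
  compressed: boolean;      // false if fallback kicked in
}

function summarize(result: CompressResult): string {
  // When fallback returns the original messages, there is nothing to report.
  if (!result.compressed) return "fallback: sent uncompressed";
  const pct = Math.round((result.tokensSaved / result.tokensBefore) * 100);
  return `saved ${result.tokensSaved} tokens (${pct}%)`;
}
```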

Environment Variables

Instead of passing options, set environment variables:

  • HEADROOM_BASE_URL — proxy or cloud URL (default: http://localhost:8787)
  • HEADROOM_API_KEY — Headroom Cloud API key
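A sketch of how this resolution likely works: explicit options win, then environment variables, then the documented default. The exact precedence is an assumption.

```typescript
// Illustrative option resolution: explicit > env var > default.
function resolveBaseUrl(explicit?: string): string {
  return explicit ?? process.env.HEADROOM_BASE_URL ?? "http://localhost:8787";
}
```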

Reusable Client

For apps making many calls, create a client once and reuse it:

import { HeadroomClient } from 'headroom-ai';

const client = new HeadroomClient({
  baseUrl: 'http://localhost:8787',
  apiKey: 'hr_...',
});

const r1 = await client.compress(messages1, { model: 'gpt-4o' });
const r2 = await client.compress(messages2, { model: 'gpt-4o' });

Framework Adapters

Vercel AI SDK

The Headroom middleware plugs directly into Vercel AI SDK's wrapLanguageModel():

import { headroomMiddleware } from 'headroom-ai/vercel-ai';
import { wrapLanguageModel, generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const model = wrapLanguageModel({
  model: openai('gpt-4o'),
  middleware: headroomMiddleware(),
});

// All calls through this model are automatically compressed
const { text } = await generateText({ model, messages });

The middleware intercepts messages in the transformParams hook, converts Vercel's internal format to OpenAI format, compresses via the proxy, and converts back. Your app code doesn't change.
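The conversion step can be sketched roughly like this. Real Vercel AI SDK messages carry richer content parts (images, tool calls); this simplified version handles only the text case, and the type names here mirror but do not reproduce the SDK's actual types.

```typescript
// Simplified sketch of the Vercel-to-OpenAI message conversion.
type VercelTextPart = { type: "text"; text: string };
type VercelMessage = {
  role: "user" | "assistant" | "system";
  content: string | VercelTextPart[];
};
type OpenAIMessage = { role: string; content: string };

function toOpenAI(msg: VercelMessage): OpenAIMessage {
  // Content may be a plain string or an array of typed parts.
  const content =
    typeof msg.content === "string"
      ? msg.content
      : msg.content.filter((p) => p.type === "text").map((p) => p.text).join("");
  return { role: msg.role, content };
}
```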

You can also compress Vercel messages directly:

import { compressVercelMessages } from 'headroom-ai/vercel-ai';

const result = await compressVercelMessages(modelMessages, { model: 'gpt-4o' });
// result.messages is in Vercel ModelMessage[] format

OpenAI SDK

Wrap your OpenAI client to auto-compress messages on every chat.completions.create() call:

import { withHeadroom } from 'headroom-ai/openai';
import OpenAI from 'openai';

const client = withHeadroom(new OpenAI());

// Messages are compressed before sending — transparent to your code
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: longConversation,
});

Only chat.completions.create() is intercepted. All other methods (embeddings, images, audio) pass through unchanged.
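One way such selective interception can be implemented is with a Proxy that rewrites the arguments of a single method and forwards everything else untouched. This is a generic sketch of the pattern, not the SDK's actual implementation (which must also reach the nested chat.completions.create path):

```typescript
// Generic sketch: intercept one method's arguments, pass everything else through.
function interceptMethod<T extends object>(
  target: T,
  methodName: string,
  before: (args: any[]) => any[],
): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value = Reflect.get(obj, prop, receiver);
      if (prop === methodName && typeof value === "function") {
        // Transform arguments (e.g. compress messages) before delegating.
        return (...args: any[]) => value.apply(obj, before(args));
      }
      return value; // all other properties/methods are untouched
    },
  });
}
```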

Anthropic SDK

Same pattern for the Anthropic client:

import { withHeadroom } from 'headroom-ai/anthropic';
import Anthropic from '@anthropic-ai/sdk';

const client = withHeadroom(new Anthropic());

const response = await client.messages.create({
  model: 'claude-sonnet-4-5-20250929',
  messages: longConversation,
  max_tokens: 1024,
});

Only messages.create() is intercepted. The adapter converts between Anthropic's content block format and OpenAI format automatically.
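One half of that conversion can be sketched as flattening Anthropic's content-block arrays into the plain strings OpenAI format uses. This is a simplified illustration covering only text blocks; the adapter must also handle tool-use and other block types.

```typescript
// Simplified sketch: flatten Anthropic content blocks to a plain string.
type AnthropicTextBlock = { type: "text"; text: string };
type AnthropicMessage = {
  role: "user" | "assistant";
  content: string | AnthropicTextBlock[];
};

function blocksToString(msg: AnthropicMessage): string {
  return typeof msg.content === "string"
    ? msg.content
    : msg.content.map((b) => b.text).join("\n");
}
```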

Error Handling

import { compress, HeadroomConnectionError, HeadroomAuthError } from 'headroom-ai';

try {
  const result = await compress(messages, { model: 'gpt-4o', fallback: false });
} catch (error) {
  if (error instanceof HeadroomAuthError) {
    // Invalid API key (401)
  } else if (error instanceof HeadroomConnectionError) {
    // Proxy unreachable
  }
}

With fallback: true (the default), connection errors and 5xx responses return the original messages uncompressed instead of throwing. Auth errors (401) and client errors (400) always throw.

Fallback Behavior

By default, compress() never takes your request down with it. How each failure is handled depends on the fallback option:

| Scenario | fallback: true (default) | fallback: false |
| --- | --- | --- |
| Proxy unreachable | Returns uncompressed, compressed: false | Throws HeadroomConnectionError |
| Proxy 503 error | Returns uncompressed after retries | Throws HeadroomCompressError |
| Invalid API key (401) | Throws HeadroomAuthError | Throws HeadroomAuthError |
| Bad request (400) | Throws HeadroomCompressError | Throws HeadroomCompressError |
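The rule the table describes can be sketched as a small wrapper: infrastructure failures fall back to the original messages when allowed, while auth and client errors always propagate. The error classes and function here are illustrative stand-ins, not the SDK's.

```typescript
// Illustrative sketch of the fallback rule (error classes are stand-ins).
class ConnectionError extends Error {}
class AuthError extends Error {}

async function compressWithFallback<M>(
  messages: M[],
  doCompress: (m: M[]) => Promise<M[]>,
  fallback: boolean,
): Promise<{ messages: M[]; compressed: boolean }> {
  try {
    return { messages: await doCompress(messages), compressed: true };
  } catch (err) {
    if (fallback && err instanceof ConnectionError) {
      // Proxy down: pass the original messages through unchanged.
      return { messages, compressed: false };
    }
    throw err; // auth (401) and client (400) errors always propagate
  }
}
```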

Zero Dependencies

The headroom-ai package has no runtime dependencies. Framework SDKs (Vercel AI, OpenAI, Anthropic) are optional peer dependencies — only install what you use.

OpenClaw Plugin

The TypeScript SDK powers the headroom-openclaw plugin for OpenClaw agents. The plugin uses HeadroomClient internally to compress context during the assemble() lifecycle hook. Install it with openclaw plugins install headroom-openclaw. See the plugin source for details.

Comparison with Python SDK

| Feature | Python SDK | TypeScript SDK |
| --- | --- | --- |
| compress() | Native (runs locally) | HTTP client (calls proxy) |
| Proxy | Built-in server | Connects to proxy |
| Vercel AI SDK | N/A | Middleware adapter |
| OpenAI SDK | HeadroomClient wrapper | withHeadroom() wrapper |
| Anthropic SDK | HeadroomClient wrapper | withHeadroom() wrapper |
| LangChain | HeadroomChatModel | Use compress() directly |
| Memory system | Full (SQLite + HNSW) | Not yet (use proxy) |
| MCP server | Built-in | Not yet |
| CLI tools | headroom proxy, headroom wrap, etc. | N/A (use Python CLI) |