Token-aware Context Compaction
Agents with long conversations or large tool results can exceed the model's context window. The ContextCompactor automatically manages context size before each LLM call.
Configuration
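The exact options depend on your setup; the shape below is a hypothetical sketch, with field names that are illustrative rather than the library's actual API:

```typescript
// Hypothetical configuration shape -- field names are illustrative,
// not the library's actual API.
type CompactionStrategy = "trim" | "summarize" | "hybrid";

interface CompactorConfig {
  strategy: CompactionStrategy; // which strategy to apply when over budget
  maxTokens: number;            // token budget for the full message list
  summaryModel?: string;        // cheap model used by summarize/hybrid
  keepRecent: number;           // recent messages always preserved verbatim
}

const config: CompactorConfig = {
  strategy: "hybrid",
  maxTokens: 100_000,
  summaryModel: "small-fast-model",
  keepRecent: 8,
};
```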
Strategies
Trim
Drops the oldest non-system messages first, keeping the system prompt and the most recent exchanges intact.

Summarize

Uses a cheap model to summarize older messages into a single compact summary, preserving key context while reducing token count.

Hybrid

Trims first, then summarizes if the result is still over budget. Offers the best balance of speed and context preservation.

How It Works
The compactor hooks into the beforeLLMCall loop hook and runs before every LLM API call:
- Estimates token count of all messages
- If under budget, passes through unchanged
- If over budget, applies the configured strategy
- Returns the compacted messages to the LLM
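The steps above can be sketched roughly as follows. The hook signature, message shape, and the character-based token heuristic are assumptions for illustration, not the library's actual implementation, and only the trim strategy is shown:

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Step 1: rough token estimate, using the common ~4 characters-per-token heuristic.
function estimateTokens(messages: Message[]): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / 4);
}

// Trim strategy: drop the oldest non-system messages until under budget,
// always preserving the system prompt.
function trim(messages: Message[], budget: number): Message[] {
  const result = [...messages];
  while (estimateTokens(result) > budget) {
    const idx = result.findIndex((m) => m.role !== "system");
    if (idx === -1) break; // only system messages remain; nothing left to drop
    result.splice(idx, 1);
  }
  return result;
}

// Hypothetical beforeLLMCall hook body: pass messages through unchanged when
// under budget (step 2), otherwise apply the strategy and return the
// compacted list (steps 3-4).
function beforeLLMCall(messages: Message[], budget: number): Message[] {
  if (estimateTokens(messages) <= budget) return messages;
  return trim(messages, budget);
}
```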