Context Budget
When multiple memory stores are enabled, the combined context string injected into the system prompt can grow large. The context budget system caps the total token count and distributes the budget proportionally across memory sections, so the most important context always makes it in.

The Problem
A fully-loaded memory config (summaries, user facts, entities, learnings, graph, decisions, procedures) can produce thousands of tokens of context. Without a budget, all of it is injected — potentially blowing past the model's context window or crowding out the actual conversation.

Configuration
Add `contextBudget` to your memory config.
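The exact config shape depends on your setup; the following is a minimal sketch, assuming `contextBudget` nests under the memory config with a `maxTokens` cap and optional per-section `priorities` (the field names come from this page, the surrounding structure is illustrative):

```typescript
// Hypothetical memory config — `summaries`/`userFacts` store options are
// placeholders; `contextBudget.maxTokens` and `priorities` are the fields
// described on this page.
const memoryConfig = {
  summaries: { enabled: true },
  userFacts: { enabled: true },
  contextBudget: {
    maxTokens: 1500, // hard cap on injected context tokens
    priorities: {
      summaries: 0.25, // defaults shown; override to rebalance
      userFacts: 0.15,
    },
  },
};
```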
When `maxTokens` is set, `buildContext()` allocates tokens to each section based on its priority weight. Sections that exceed their allocation are trimmed line by line; sections that fit are included in full.
Default Priorities
Each memory section has a default priority that determines what share of the budget it receives. Higher values get more tokens.

| Section | Default Priority | Purpose |
|---|---|---|
| `summaries` | 0.25 | Conversation history summaries |
| `userProfile` | 0.15 | Structured user data |
| `userFacts` | 0.15 | Discrete user preferences and facts |
| `entities` | 0.15 | Companies, people, projects |
| `graph` | 0.10 | Knowledge graph nodes |
| `decisions` | 0.10 | Recent agent decision audit trail |
| `learnings` | 0.05 | Vector-backed insights |
| `procedures` | 0.05 | Recorded tool-call workflows |
How Budget Allocation Works
`buildContext()` follows these steps:
- Gather — Fetch context strings from every enabled store.
- Measure — Count the tokens in each section.
- Check — If total tokens are under `maxTokens`, return everything as-is.
- Allocate — Assign each section a token budget proportional to its priority weight.
- Trim — Walk the sections from highest to lowest priority, including each section that fits. If a section exceeds its remaining budget, trim it line by line until it fits. Sections below the cutoff are dropped entirely.
- Assemble — Join the surviving sections in priority order (highest first).
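The steps above can be sketched as follows. `Section`, `buildContextSketch`, and the ~4-characters-per-token counter are illustrative assumptions, not the library's actual internals:

```typescript
// Illustrative sketch of the gather → measure → check → allocate → trim →
// assemble pipeline. Token counting here is a rough heuristic (≈4 chars/token).
type Section = { name: string; priority: number; text: string };

const countTokens = (s: string): number => Math.ceil(s.length / 4);

function buildContextSketch(sections: Section[], maxTokens: number): string {
  // Measure: total tokens across all gathered sections.
  const total = sections.reduce((n, s) => n + countTokens(s.text), 0);
  // Assemble order is always highest priority first.
  const ordered = [...sections].sort((a, b) => b.priority - a.priority);
  // Check: under budget — return everything as-is.
  if (total <= maxTokens) return ordered.map((s) => s.text).join("\n\n");

  // Allocate: each section gets a share proportional to its priority weight.
  const weightSum = sections.reduce((n, s) => n + s.priority, 0);
  const kept: string[] = [];
  for (const s of ordered) {
    const budget = Math.floor(maxTokens * (s.priority / weightSum));
    // Trim line by line from the end until the section fits its allocation.
    const lines = s.text.split("\n");
    while (lines.length > 0 && countTokens(lines.join("\n")) > budget) {
      lines.pop();
    }
    // Sections trimmed to nothing are dropped entirely.
    if (lines.length > 0) kept.push(lines.join("\n"));
  }
  return kept.join("\n\n");
}
```

Proportional allocation means a section never starves just because a higher-priority section is large — each gets at least its weighted share of the cap.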
Custom Priorities
Override any priority to shift the budget toward what matters most for your use case.

Priority presets by use case
| Use Case | Boost | Reduce |
|---|---|---|
| Knowledge-heavy (research, RAG) | learnings: 0.30 | summaries: 0.10 |
| Audit-focused (compliance, finance) | decisions: 0.30 | learnings: 0.05 |
| CRM / relationship | userFacts: 0.25, entities: 0.25 | procedures: 0.02 |
| Long conversations | summaries: 0.40 | graph: 0.05 |
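As a hedged example, the knowledge-heavy preset from the first row might look like this — the surrounding `contextBudget` shape is an assumption, and any section left out of `priorities` keeps its default:

```typescript
// Hypothetical override for a knowledge-heavy (research/RAG) agent.
const contextBudget = {
  maxTokens: 2000,
  priorities: {
    learnings: 0.3, // boost vector-backed insights
    summaries: 0.1, // reduce conversation summaries
  },
};
```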
Inspecting Token Usage
Call `buildContext()` directly and measure the result to see exactly how many tokens are being used.
Use this to tune `maxTokens`: start with a generous budget, inspect the output, then tighten it based on what you actually see.
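A sketch of this inspection, assuming `buildContext()` resolves to the assembled context string. `estimateTokens` is a rough ~4-characters-per-token heuristic; for exact counts, use your model's real tokenizer:

```typescript
// Rough heuristic token counter — replace with your model's tokenizer for
// exact numbers.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Assumed signature: buildContext() returns a Promise<string> of the
// assembled context. Logs size and returns the estimated token count.
async function inspectContext(buildContext: () => Promise<string>): Promise<number> {
  const context = await buildContext();
  console.log(`context: ~${estimateTokens(context)} tokens, ${context.length} chars`);
  return estimateTokens(context);
}
```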
Without a Budget
If you omit `contextBudget`, all sections are concatenated without any trimming. This is fine when memory stores are small or the model has a large context window, but you should add a budget once context grows beyond a few thousand tokens.
Cross-References
- Memory Overview — How the full memory system works
- Memory Stores — Deep dive into each store type
- Simplified API — `remember`/`recall`/`forget`
- Cost Tracking — Monitor token spend across memory extraction