Overview
85% of agent traffic is simple queries hitting expensive models. The ModelRouter classifies incoming requests by complexity and routes them to the cheapest model that can handle them, achieving 40–85% cost savings.
Quick Start
Built-in Complexity Classifier
The built-in classifier runs in under 1 ms with no LLM call. It scores each request on the following signals:

| Signal | Weight | Example |
|---|---|---|
| Token count | +0.05–0.15 | Long prompts → higher complexity |
| Code markers | +0.15 | Code fences, function, class, import |
| Reasoning keywords | +0.20 | “analyze”, “compare”, “step by step” |
| Complex instructions | +0.15 | “build a system”, “design an API” |
| Tool count | +0.08–0.15 | More tools → harder routing |
| Structured output | +0.10 | JSON schema responses |
| Multi-turn depth | +0.05–0.10 | Deep conversations |
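
The scoring table can be read as an additive heuristic. The sketch below is not the router's actual implementation. It only illustrates how weights like these might be summed without an LLM call. The `RouteRequest` shape, the `scoreComplexity` name, the four-characters-per-token estimate, and every cutoff (500 and 2000 tokens, 5 tools, 3 and 10 turns) are assumptions made for illustration; only the signal names and weight ranges come from the table.

```typescript
// Hypothetical request shape; field names are assumptions, not the router's API.
interface RouteRequest {
  prompt: string;
  toolCount?: number;
  wantsStructuredOutput?: boolean;
  turnCount?: number;
}

// Sums the table's weights using simple string checks (no LLM call).
function scoreComplexity(req: RouteRequest): number {
  let score = 0;

  // Token count (+0.05 to +0.15): rough estimate of ~4 characters per token.
  const approxTokens = Math.ceil(req.prompt.length / 4);
  if (approxTokens > 2000) score += 0.15;
  else if (approxTokens > 500) score += 0.05;

  // Code markers (+0.15): code fences or common keywords.
  if (/`{3}|\b(function|class|import)\b/.test(req.prompt)) score += 0.15;

  // Reasoning keywords (+0.20).
  if (/\b(analyze|compare|step by step)\b/i.test(req.prompt)) score += 0.2;

  // Complex instructions (+0.15).
  if (/\b(build a system|design an api)\b/i.test(req.prompt)) score += 0.15;

  // Tool count (+0.08 to +0.15): assumed cutoff at 5 tools.
  const tools = req.toolCount ?? 0;
  if (tools > 5) score += 0.15;
  else if (tools > 0) score += 0.08;

  // Structured output (+0.10).
  if (req.wantsStructuredOutput) score += 0.1;

  // Multi-turn depth (+0.05 to +0.10): assumed cutoffs at 3 and 10 turns.
  const turns = req.turnCount ?? 0;
  if (turns > 10) score += 0.1;
  else if (turns > 3) score += 0.05;

  // Clamp to 1 to guard against floating-point drift.
  return Math.min(1, score);
}
```

With these assumed cutoffs, a short greeting scores 0, while a prompt that contains both a code keyword and a reasoning keyword picks up both weights.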
Custom Routing Rules
Override the classifier with explicit rules:
Outcome Tracking
When outcomeTracking: true, the router logs success/failure per tier in a ring buffer:
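
As a rough illustration of per-tier ring-buffer logging, the sketch below keeps the most recent outcomes for each tier in a fixed-size buffer and overwrites the oldest entry once the buffer is full. The class names `OutcomeRing` and `OutcomeTracker`, the default capacity of 100, and the `successRate` helper are hypothetical; the router's real data structure and API may differ.

```typescript
// Hypothetical fixed-size ring buffer of success/failure flags.
class OutcomeRing {
  private readonly slots: boolean[];
  private readonly capacity: number;
  private next = 0;  // index of the slot to overwrite next
  private count = 0; // number of valid entries, at most capacity

  constructor(capacity: number) {
    this.capacity = capacity;
    this.slots = new Array<boolean>(capacity).fill(false);
  }

  record(success: boolean): void {
    this.slots[this.next] = success;
    this.next = (this.next + 1) % this.capacity;
    this.count = Math.min(this.count + 1, this.capacity);
  }

  // Fraction of recorded outcomes that succeeded, or undefined if empty.
  successRate(): number | undefined {
    if (this.count === 0) return undefined;
    let ok = 0;
    for (let i = 0; i < this.count; i++) {
      if (this.slots[i]) ok++;
    }
    return ok / this.count;
  }
}

// Hypothetical tracker holding one ring per tier name.
class OutcomeTracker {
  private readonly rings = new Map<string, OutcomeRing>();
  private readonly capacity: number;

  constructor(capacity = 100) {
    this.capacity = capacity;
  }

  log(tier: string, success: boolean): void {
    let ring = this.rings.get(tier);
    if (!ring) {
      ring = new OutcomeRing(this.capacity);
      this.rings.set(tier, ring);
    }
    ring.record(success);
  }

  successRate(tier: string): number | undefined {
    return this.rings.get(tier)?.successRate();
  }
}
```

A ring buffer bounds memory per tier and keeps the statistics weighted toward recent traffic, which is presumably why it suits this kind of tracking.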
Events
| Event | Payload |
|---|---|
| model.routed | { tier, complexity, modelId } |
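
A consumer of the model.routed event might tally routes per tier. The sketch below assumes field types that the table does not specify (tier and modelId as strings, complexity as a number) and deliberately leaves out how a handler is registered, since the subscription API is not shown here. `ModelRoutedPayload` and `onModelRouted` are illustrative names.

```typescript
// Assumed payload types; the table lists only the field names.
interface ModelRoutedPayload {
  tier: string;
  complexity: number;
  modelId: string;
}

// Running count of routed requests per tier.
const routesPerTier = new Map<string, number>();

// Hypothetical handler to attach to the model.routed event.
function onModelRouted(payload: ModelRoutedPayload): void {
  const seen = routesPerTier.get(payload.tier) ?? 0;
  routesPerTier.set(payload.tier, seen + 1);
}
```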