
Cost Tracking

CostTracker records token usage and cost across agent runs. It supports budget enforcement, per-model pricing, cost breakdowns by token category, and summaries per agent, model, and user.

Quick Start

import { Agent, openai, CostTracker } from "@radaros/core";

const tracker = new CostTracker({
  budget: {
    maxCostPerSession: 1.0,   // $1 per session
    maxCostPerUser: 10.0,     // $10 per user
    onBudgetExceeded: "throw",
  },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

await agent.run("Hello!", { sessionId: "s1", userId: "user-42" });

const summary = tracker.getSummary();
console.log(`Total cost: $${summary.totalCost.toFixed(4)}`);

Token Types Tracked

The CostTracker captures all token types returned by the API:
| Token Type | Field | Description |
| --- | --- | --- |
| Input | promptTokens | Tokens in the prompt / input |
| Output | completionTokens | Tokens generated in the response |
| Reasoning | reasoningTokens | Tokens used for chain-of-thought (o1, o3) |
| Cached | cachedTokens | Prompt tokens served from API cache |
| Audio Input | audioInputTokens | Tokens from audio input (Realtime API) |
| Audio Output | audioOutputTokens | Tokens for audio output (Realtime API) |

All token types are tracked per-message, per-run, and per-session. The getSummary() method aggregates totals across all tracked dimensions.
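
To illustrate what per-message to per-run aggregation means, here is a minimal sketch. The `TokenUsage` interface and the `sumUsage` helper are hypothetical names that mirror the fields listed above; they are not taken from the RadarOS source, and the real tracker may aggregate differently.

```typescript
// Hypothetical shape mirroring the fields listed above; the real RadarOS
// type may differ.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  reasoningTokens?: number;
  cachedTokens?: number;
  audioInputTokens?: number;
  audioOutputTokens?: number;
}

// Sum per-message usage into one total, treating absent fields as 0.
function sumUsage(messages: TokenUsage[]): Required<TokenUsage> {
  const total: Required<TokenUsage> = {
    promptTokens: 0,
    completionTokens: 0,
    reasoningTokens: 0,
    cachedTokens: 0,
    audioInputTokens: 0,
    audioOutputTokens: 0,
  };
  for (const m of messages) {
    total.promptTokens += m.promptTokens;
    total.completionTokens += m.completionTokens;
    total.reasoningTokens += m.reasoningTokens ?? 0;
    total.cachedTokens += m.cachedTokens ?? 0;
    total.audioInputTokens += m.audioInputTokens ?? 0;
    total.audioOutputTokens += m.audioOutputTokens ?? 0;
  }
  return total;
}

// Two messages in one run: the second reports 16 cached prompt tokens.
const runTotal = sumUsage([
  { promptTokens: 16, completionTokens: 10 },
  { promptTokens: 40, completionTokens: 25, cachedTokens: 16 },
]);
console.log(runTotal.promptTokens, runTotal.completionTokens, runTotal.cachedTokens);
// prints: 56 35 16
```

The same reduction applied to run totals would give a session total.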

Raw Provider Metrics

Every RunOutput.usage object includes a providerMetrics field containing the raw, unmodified usage data returned by the provider API. This gives full transparency without any normalization loss:
const result = await agent.run("Hello!");

// Normalized RadarOS fields
console.log(result.usage.promptTokens);       // 16
console.log(result.usage.completionTokens);   // 10

// Raw provider-specific data (varies by provider)
console.log(result.usage.providerMetrics);
Example providerMetrics (an OpenAI-style payload; the shape varies by provider):
{
  "prompt_tokens": 16,
  "completion_tokens": 10,
  "total_tokens": 26,
  "prompt_tokens_details": { "cached_tokens": 0 },
  "completion_tokens_details": { "reasoning_tokens": 0 }
}
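
The relationship between the raw payload and the normalized fields can be sketched as follows. `normalizeUsage` and both interfaces are hypothetical and only cover the OpenAI-style shape shown above; how RadarOS maps other providers is not described on this page.

```typescript
// Raw usage shape as shown in the example above (OpenAI-style).
interface RawOpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number };
}

// Hypothetical normalized shape; field names follow the token table.
interface NormalizedUsage {
  promptTokens: number;
  completionTokens: number;
  reasoningTokens: number;
  cachedTokens: number;
  providerMetrics: RawOpenAIUsage; // kept verbatim, nothing dropped
}

// Hypothetical helper, not part of the RadarOS API.
function normalizeUsage(raw: RawOpenAIUsage): NormalizedUsage {
  return {
    promptTokens: raw.prompt_tokens,
    completionTokens: raw.completion_tokens,
    reasoningTokens: raw.completion_tokens_details?.reasoning_tokens ?? 0,
    cachedTokens: raw.prompt_tokens_details?.cached_tokens ?? 0,
    providerMetrics: raw,
  };
}

const usage = normalizeUsage({
  prompt_tokens: 16,
  completion_tokens: 10,
  total_tokens: 26,
  prompt_tokens_details: { cached_tokens: 0 },
  completion_tokens_details: { reasoning_tokens: 0 },
});
console.log(usage.promptTokens, usage.completionTokens); // prints: 16 10
console.log(usage.providerMetrics.total_tokens);         // prints: 26
```

Keeping the raw object alongside the normalized fields is what allows provider-specific values such as `total_tokens` to remain available after normalization.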

Cost Breakdown

Each cost entry includes a breakdown across six categories, plus their total:
interface CostBreakdown {
  input: number;       // Prompt token cost
  output: number;      // Completion token cost
  reasoning: number;   // Reasoning token cost
  cached: number;      // Cached prompt cost (discounted rate)
  audioInput: number;  // Audio input token cost
  audioOutput: number; // Audio output token cost
  total: number;       // Sum of all categories
}
Access per-run or aggregated:
// Per-entry breakdown
const entry = tracker.getEntries()[0];
console.log(entry.breakdown.input);    // $0.000375
console.log(entry.breakdown.output);   // $0.004956
console.log(entry.breakdown.total);    // $0.005331

// Aggregated summary
const summary = tracker.getSummary();
console.log(summary.totalBreakdown.total);       // Total across all entries
console.log(summary.byAgent["assistant"].breakdown); // Per-agent breakdown
console.log(summary.byModel["gpt-4o"].breakdown);   // Per-model breakdown
console.log(summary.byUser["user-42"].breakdown);    // Per-user breakdown

Built-in Pricing

Pricing is included for 50+ models:
| Model | Prompt / 1K tokens | Completion / 1K tokens |
| --- | --- | --- |
| gpt-4o | $0.0025 | $0.01 |
| gpt-4o-mini | $0.00015 | $0.0006 |
| claude-3.5-sonnet | $0.003 | $0.015 |
| gemini-2.0-flash | $0.0001 | $0.0004 |

Override or extend pricing:
const tracker = new CostTracker({
  pricing: {
    "my-custom-model": {
      promptPer1k: 0.005,
      completionPer1k: 0.02,
      reasoningPer1k: 0.06,        // Optional: reasoning token pricing
      cachedPromptPer1k: 0.0005,   // Optional: cached prompt pricing
      audioInputPer1k: 0.1,        // Optional: audio input pricing
      audioOutputPer1k: 0.2,       // Optional: audio output pricing
    },
  },
});
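
To make the per-1K fields concrete, the sketch below derives a breakdown from token counts and a pricing entry. It is an illustration under stated assumptions, not the tracker's actual algorithm: `computeBreakdown` is a hypothetical helper, each category is assumed to be billed as `tokens / 1000 * rate`, missing optional rates are assumed to cost $0, and the token counts are assumed to be disjoint (for example, `promptTokens` excluding cached tokens).

```typescript
interface ModelPricing {
  promptPer1k: number;
  completionPer1k: number;
  reasoningPer1k?: number;
  cachedPromptPer1k?: number;
  audioInputPer1k?: number;
  audioOutputPer1k?: number;
}

interface UsageCounts {
  promptTokens: number;
  completionTokens: number;
  reasoningTokens?: number;
  cachedTokens?: number;
  audioInputTokens?: number;
  audioOutputTokens?: number;
}

interface CostBreakdown {
  input: number;
  output: number;
  reasoning: number;
  cached: number;
  audioInput: number;
  audioOutput: number;
  total: number;
}

// Cost of one category; absent counts or rates contribute 0 (assumption).
const per1k = (tokens: number | undefined, rate: number | undefined): number =>
  ((tokens ?? 0) / 1000) * (rate ?? 0);

// Hypothetical helper, not part of the RadarOS API.
function computeBreakdown(usage: UsageCounts, pricing: ModelPricing): CostBreakdown {
  const input = per1k(usage.promptTokens, pricing.promptPer1k);
  const output = per1k(usage.completionTokens, pricing.completionPer1k);
  const reasoning = per1k(usage.reasoningTokens, pricing.reasoningPer1k);
  const cached = per1k(usage.cachedTokens, pricing.cachedPromptPer1k);
  const audioInput = per1k(usage.audioInputTokens, pricing.audioInputPer1k);
  const audioOutput = per1k(usage.audioOutputTokens, pricing.audioOutputPer1k);
  const total = input + output + reasoning + cached + audioInput + audioOutput;
  return { input, output, reasoning, cached, audioInput, audioOutput, total };
}

// gpt-4o rates from the built-in pricing table above.
const gpt4o: ModelPricing = { promptPer1k: 0.0025, completionPer1k: 0.01 };
const breakdown = computeBreakdown({ promptTokens: 150, completionTokens: 500 }, gpt4o);
console.log(breakdown.input.toFixed(6));  // prints: 0.000375
console.log(breakdown.output.toFixed(6)); // prints: 0.005000
console.log(breakdown.total.toFixed(6));  // prints: 0.005375
```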

Budget Enforcement

Budgets are checked before each LLM call and mid-run during tool-calling loops:
interface CostBudget {
  maxCostPerRun?: number;      // Per individual run
  maxCostPerSession?: number;  // Across all runs in a session
  maxCostPerUser?: number;     // Across all sessions for a user
  maxTokensPerRun?: number;    // Token limit per run
  onBudgetExceeded?: "throw" | "warn";
}
See Cost Auto-Stop for mid-run enforcement details.
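
As a rough mental model of what a budget check involves, consider the sketch below. `Spend`, `findViolation`, and `enforceBudget` are hypothetical names; the strict `>` comparison and the `"throw"` default when `onBudgetExceeded` is omitted are assumptions, since this page does not state either.

```typescript
interface CostBudget {
  maxCostPerRun?: number;
  maxCostPerSession?: number;
  maxCostPerUser?: number;
  maxTokensPerRun?: number;
  onBudgetExceeded?: "throw" | "warn";
}

// Hypothetical snapshot of what has been spent so far.
interface Spend {
  runCost: number;
  sessionCost: number;
  userCost: number;
  runTokens: number;
}

type LimitKey = Exclude<keyof CostBudget, "onBudgetExceeded">;

interface Violation {
  budget: LimitKey;
  current: number;
  limit: number;
}

// Return the first limit that has been crossed, or null if none has.
function findViolation(budget: CostBudget, spend: Spend): Violation | null {
  const checks: Array<[LimitKey, number]> = [
    ["maxCostPerRun", spend.runCost],
    ["maxCostPerSession", spend.sessionCost],
    ["maxCostPerUser", spend.userCost],
    ["maxTokensPerRun", spend.runTokens],
  ];
  for (const [key, current] of checks) {
    const limit = budget[key];
    // Assumption: a limit is exceeded only when spend is strictly above it.
    if (limit !== undefined && current > limit) {
      return { budget: key, current, limit };
    }
  }
  return null;
}

// Throw or warn depending on the configured mode (default assumed: "throw").
function enforceBudget(budget: CostBudget, spend: Spend): Violation | null {
  const violation = findViolation(budget, spend);
  if (violation === null) return null;
  const message = `${violation.budget} exceeded: ${violation.current} > ${violation.limit}`;
  if ((budget.onBudgetExceeded ?? "throw") === "throw") {
    const err = new Error(message);
    err.name = "CostBudgetExceededError";
    throw err;
  }
  console.warn(message);
  return violation;
}

const violation = enforceBudget(
  { maxCostPerRun: 0.5, maxCostPerSession: 2.0, onBudgetExceeded: "warn" },
  { runCost: 0.1, sessionCost: 2.25, userCost: 2.25, runTokens: 1200 },
);
console.log(violation?.budget); // prints: maxCostPerSession
```

Running a check like this before each LLM call, and again between tool-calling iterations, corresponds to the two enforcement points described above.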

Works Across All Agent Types

The same CostTracker instance can be shared across different agent types:
import { Agent, VoiceAgent, openai, openaiRealtime, CostTracker } from "@radaros/core";
import { BrowserAgent } from "@radaros/browser";

const tracker = new CostTracker({
  budget: { maxCostPerUser: 10.0 },
});

// Text agent
const textAgent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

// Voice agent
const voiceAgent = new VoiceAgent({
  name: "voice-assistant",
  provider: openaiRealtime("gpt-4o-realtime-preview"),
  costTracker: tracker,
});

// Browser agent
const browserAgent = new BrowserAgent({
  name: "web-scraper",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

// All three agents report to the same tracker
await textAgent.run("Hello!", { userId: "user-42" });
const session = await voiceAgent.connect({ userId: "user-42" });
await browserAgent.run("Search for flights", { userId: "user-42" });

const summary = tracker.getSummary();
console.log(summary.byAgent);
// { assistant: {...}, "voice-assistant": {...}, "web-scraper": {...} }
console.log(summary.byUser["user-42"].cost); // Combined cost across all agents

Cost Summary

const summary = tracker.getSummary({ userId: "user-42" });

summary.totalCost;              // Total USD
summary.totalTokens;            // Aggregated TokenUsage
summary.totalBreakdown;         // Aggregated CostBreakdown
summary.byAgent["assistant"];   // { cost, breakdown, tokens, runs }
summary.byModel["gpt-4o"];     // { cost, breakdown, tokens }
summary.byUser["user-42"];     // { cost, breakdown, tokens }
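
The byAgent, byModel, and byUser views are all groupings of the same entries by a different key. The sketch below shows that idea with a hypothetical `CostEntry` shape and a hypothetical `groupCost` helper; the real entry type and summary fields may carry more data (breakdowns, tokens) than shown here.

```typescript
// Hypothetical entry shape; real entries likely carry usage and breakdowns too.
interface CostEntry {
  agentName: string;
  modelId: string;
  userId?: string;
  cost: number;
}

interface Bucket {
  cost: number;
  entries: number;
}

// Group entries by an arbitrary key, skipping entries where the key is absent.
function groupCost(
  entries: CostEntry[],
  keyOf: (entry: CostEntry) => string | undefined,
): Record<string, Bucket> {
  const out: Record<string, Bucket> = {};
  for (const entry of entries) {
    const key = keyOf(entry);
    if (key === undefined) continue;
    let bucket = out[key];
    if (bucket === undefined) {
      bucket = { cost: 0, entries: 0 };
      out[key] = bucket;
    }
    bucket.cost += entry.cost;
    bucket.entries += 1;
  }
  return out;
}

const entries: CostEntry[] = [
  { agentName: "assistant", modelId: "gpt-4o", userId: "user-42", cost: 0.005 },
  { agentName: "router", modelId: "gpt-4o-mini", userId: "user-42", cost: 0.001 },
  { agentName: "assistant", modelId: "gpt-4o", userId: "user-7", cost: 0.002 },
];

const byAgent = groupCost(entries, (e) => e.agentName);
const byUser = groupCost(entries, (e) => e.userId);
console.log(byAgent["assistant"].cost.toFixed(4), byAgent["assistant"].entries); // prints: 0.0070 2
console.log(byUser["user-42"].cost.toFixed(4));                                  // prints: 0.0060
```

Filtering the entry list by `userId` before grouping would give a user-scoped view similar in spirit to `getSummary({ userId: "user-42" })`.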

Events

| Event | Payload |
| --- | --- |
| cost.tracked | { runId, agentName, modelId, usage, cost } |
| cost.budget.exceeded | { runId, agentName, budget, current, limit } |

Subscribing to Cost Events

Listen for cost events to build dashboards, alerts, or analytics:
import { Agent, openai, CostTracker } from "@radaros/core";

const tracker = new CostTracker({
  budget: { maxCostPerSession: 2.0, onBudgetExceeded: "warn" },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

agent.on("cost.tracked", ({ runId, agentName, modelId, usage }) => {
  console.log(
    `[Cost] ${agentName} / ${modelId}: ` +
    `${usage.promptTokens} prompt + ${usage.completionTokens} completion`
  );
});

agent.on("cost.budget.exceeded", ({ agentName, budget, current, limit }) => {
  console.warn(
    `[Budget] ${agentName} exceeded ${budget}: $${current.toFixed(2)} / $${limit.toFixed(2)}`
  );
});
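
For alerting, a handler usually needs state across events. The sketch below builds a handler that fires a callback once when cumulative cost crosses a threshold. `makeSpendAlert` is a hypothetical helper, and it assumes the `cost` field of the `cost.tracked` payload is a plain USD number, which this page does not confirm.

```typescript
// Assumed subset of the cost.tracked payload; `cost` is treated as USD.
interface CostTrackedEvent {
  agentName: string;
  cost: number;
}

// Build a handler that calls `notify` once, the first time the running
// total reaches `thresholdUsd`.
function makeSpendAlert(
  thresholdUsd: number,
  notify: (totalUsd: number) => void,
): (event: CostTrackedEvent) => void {
  let total = 0;
  let fired = false;
  return (event: CostTrackedEvent): void => {
    total += event.cost;
    if (!fired && total >= thresholdUsd) {
      fired = true;
      notify(total);
    }
  };
}

const alerts: number[] = [];
const onCostTracked = makeSpendAlert(0.01, (totalUsd) => {
  alerts.push(totalUsd);
});

// Simulated events; in an application these would come from the agent.
for (const cost of [0.004, 0.004, 0.004, 0.004]) {
  onCostTracked({ agentName: "assistant", cost });
}
console.log(alerts.length); // prints: 1
```

If the payload assumption holds, the handler could be registered with `agent.on("cost.tracked", onCostTracked)`.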

Per-Agent and Per-Model Breakdown

const tracker = new CostTracker();

const assistantAgent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

const routerAgent = new Agent({
  name: "router",
  model: openai("gpt-4o-mini"),
  costTracker: tracker,
});

await assistantAgent.run("Complex analysis task", { userId: "user-42" });
await routerAgent.run("Route this request", { userId: "user-42" });

const summary = tracker.getSummary();

// Per-agent costs with breakdowns
for (const [name, data] of Object.entries(summary.byAgent)) {
  console.log(`${name}: $${data.cost.toFixed(4)} (${data.runs} runs)`);
  console.log(`  input: $${data.breakdown.input.toFixed(6)}`);
  console.log(`  output: $${data.breakdown.output.toFixed(6)}`);
}

// Per-model costs
for (const [model, data] of Object.entries(summary.byModel)) {
  console.log(`${model}: $${data.cost.toFixed(4)}`);
}

// Per-user costs
for (const [user, data] of Object.entries(summary.byUser)) {
  console.log(`${user}: $${data.cost.toFixed(4)}`);
}

Custom Pricing for Non-Built-in Models

const tracker = new CostTracker({
  pricing: {
    "ft:gpt-4o-mini:my-org:custom:abc123": {
      promptPer1k: 0.0003,
      completionPer1k: 0.0012,
    },
    "llama3.1": {
      promptPer1k: 0,
      completionPer1k: 0,
    },
    "claude-opus-next": {
      promptPer1k: 0.015,
      completionPer1k: 0.075,
      reasoningPer1k: 0.06,
    },
  },
});
If a model has no pricing entry (built-in or custom), the cost is recorded as $0 but token counts are still tracked.
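
The lookup behaviour described above (custom entries override built-ins, unknown models cost $0) can be sketched like this. `resolvePricing` and `estimateCost` are hypothetical helpers, and the built-in table here contains only two rows copied from the pricing table on this page.

```typescript
interface ModelPricing {
  promptPer1k: number;
  completionPer1k: number;
}

type PricingTable = Record<string, ModelPricing>;

// Two rows copied from the built-in pricing table; the real table is larger.
const BUILT_IN_PRICING: PricingTable = {
  "gpt-4o": { promptPer1k: 0.0025, completionPer1k: 0.01 },
  "gpt-4o-mini": { promptPer1k: 0.00015, completionPer1k: 0.0006 },
};

// Custom entries take precedence over built-ins (assumed from "override or extend").
function resolvePricing(modelId: string, custom: PricingTable = {}): ModelPricing | undefined {
  return custom[modelId] ?? BUILT_IN_PRICING[modelId];
}

// Unknown models cost $0; their tokens would still be counted elsewhere.
function estimateCost(
  modelId: string,
  promptTokens: number,
  completionTokens: number,
  custom: PricingTable = {},
): number {
  const pricing = resolvePricing(modelId, custom);
  if (pricing === undefined) return 0;
  return (
    (promptTokens / 1000) * pricing.promptPer1k +
    (completionTokens / 1000) * pricing.completionPer1k
  );
}

const customPricing: PricingTable = {
  "llama3.1": { promptPer1k: 0, completionPer1k: 0 },
};

console.log(estimateCost("gpt-4o", 1000, 1000, customPricing).toFixed(4));        // prints: 0.0125
console.log(estimateCost("llama3.1", 1000, 1000, customPricing).toFixed(4));      // prints: 0.0000
console.log(estimateCost("unknown-model", 1000, 1000, customPricing).toFixed(4)); // prints: 0.0000
```

Registering a self-hosted model with explicit zero rates, as in the `llama3.1` entry, documents that the $0 cost is intentional and not a missing price.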

Budget Enforcement in Practice

const tracker = new CostTracker({
  budget: {
    maxCostPerRun: 0.50,
    maxCostPerSession: 2.00,
    maxCostPerUser: 20.00,
    maxTokensPerRun: 50_000,
    onBudgetExceeded: "throw",
  },
});

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

try {
  await agent.run("Analyze this massive dataset...", {
    sessionId: "s1",
    userId: "user-42",
  });
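} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow before reading `.name`
  if (error instanceof Error && error.name === "CostBudgetExceededError") {
    console.log("Budget exceeded: inform the user or switch to a cheaper model");
  } else {
    throw error; // Do not swallow unrelated failures
  }
}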
With onBudgetExceeded: "warn", the run continues but emits a cost.budget.exceeded event instead of throwing.
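
One way to act on the "switch to a cheaper model" suggestion is a small wrapper that retries with a fallback when the budget error is thrown. This is a sketch, not a RadarOS feature: `runWithFallback` is a hypothetical helper, and the two stub functions stand in for calls such as `agent.run(...)` on a gpt-4o agent and a gpt-4o-mini agent.

```typescript
// Run `primary`; if it fails with a budget error, run `fallback` instead.
// Any other error is rethrown unchanged.
async function runWithFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch (error) {
    if (error instanceof Error && error.name === "CostBudgetExceededError") {
      return fallback();
    }
    throw error;
  }
}

// Stub standing in for a run on the expensive model that hits its budget.
const expensiveRun = async (): Promise<string> => {
  const err = new Error("run budget exceeded");
  err.name = "CostBudgetExceededError";
  throw err;
};

// Stub standing in for a run on a cheaper model.
const cheapRun = async (): Promise<string> => "answer from cheaper model";

runWithFallback(expensiveRun, cheapRun).then((answer) => console.log(answer));
// prints: answer from cheaper model
```

If both agents share one tracker, a session or user budget that is already exhausted would presumably reject the fallback run as well, so this pattern fits per-run limits best.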

Token Accuracy

RadarOS reports that the token counts recorded by CostTracker match the raw API response token counts exactly in every scenario it validates: simple completion, tool calling, multi-turn memory, and prompt caching. See benchmarks for detailed validation results.

Cross-References