Observability

@radaros/observability is a separate, opt-in package that adds tracing, metrics, and structured logging to any RadarOS agent. It listens to the agent’s EventBus from the outside — zero changes to core, zero overhead when not installed.
npm install @radaros/observability

Quick Start

import { Agent, openai } from "@radaros/core";
import { instrument } from "@radaros/observability";

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
});

// One-liner — just pass exporter names as strings
const obs = instrument(agent, {
  exporters: ["console"],
});

await agent.run("Hello!");

// Access metrics
const m = obs.metrics.getMetrics();
console.log(`Runs: ${m.counters.runs_total}, Tokens: ${m.gauges.total_tokens}`);

// Clean up when done
await obs.tracer.flush();
obs.detach();

Exporter Shorthands

Pass exporter names as strings — credentials are read from env vars automatically:
// Langfuse — reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, LANGFUSE_BASE_URL
instrument(agent, { exporters: ["langfuse"] });

// Multiple exporters
instrument(agent, { exporters: ["langfuse", "console"] });

// OpenTelemetry — reads OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_HEADERS
instrument(agent, { exporters: ["otel"] });

// JSON file — writes to traces-<timestamp>.json
instrument(agent, { exporters: ["json-file"] });
You can also mix shorthands with custom instances when you need to override defaults:
import { instrument, LangfuseExporter } from "@radaros/observability";

instrument(agent, {
  exporters: [
    new LangfuseExporter({ baseUrl: "https://self-hosted.example.com" }),
    "console",
  ],
});

How It Works

The instrument() function attaches three listeners to the agent’s EventBus:
  1. Tracer — builds a span tree from events (run.start → tool.call → tool.result → run.complete)
  2. MetricsCollector — counts runs, tool calls, errors, cache hits, and tracks latency histograms
  3. StructuredLogger — emits JSON log entries correlated with trace IDs
Since core already emits rich events for every operation, observability works automatically with all features: handoffs, teams, cost tracking, caching, tools, etc.
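The listener pattern described above can be sketched in a few lines. This is an illustrative stand-in, not the package's actual EventBus or instrument() internals; every name below is an assumption made for the sketch:

```typescript
// Sketch of external instrumentation: subscribe to a bus, count events,
// and detach cleanly without touching the instrumented object.
type Handler = (payload: unknown) => void;

class EventBus {
  private handlers = new Map<string, Set<Handler>>();

  // Subscribe and return an unsubscribe function.
  on(event: string, fn: Handler): () => void {
    if (!this.handlers.has(event)) this.handlers.set(event, new Set());
    this.handlers.get(event)!.add(fn);
    return () => this.handlers.get(event)!.delete(fn);
  }

  emit(event: string, payload?: unknown): void {
    this.handlers.get(event)?.forEach((fn) => fn(payload));
  }
}

function instrumentBusSketch(bus: EventBus) {
  const counters: Record<string, number> = { runs_total: 0, tool_calls_total: 0 };
  const unsubs = [
    bus.on("run.start", () => counters.runs_total++),
    bus.on("tool.call", () => counters.tool_calls_total++),
  ];
  // detach() removes all listeners — the bus keeps working, uncounted.
  return { counters, detach: () => unsubs.forEach((u) => u()) };
}
```

Because observation happens entirely through subscriptions, detaching restores the exact pre-instrumentation behavior, which is what makes the zero-overhead claim possible.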

Provider Metrics in Traces & Logs

When a run completes, the run.complete event includes providerMetrics — the raw usage object from the underlying model API. This is automatically captured by:
  • Tracer — stored as a span attribute (providerMetrics) on the root run span
  • StructuredLogger — included in the JSON log payload for run.complete events
  • MetricsExporter — stored in the RunRecord for export and dashboard consumption
  • LangfuseExporter — forwarded as generation metadata in Langfuse
This means you get full provider-level transparency (e.g., thoughtsTokenCount, prompt_tokens_details, cache_read_input_tokens) in your observability pipeline without any extra configuration.
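Because providerMetrics is the raw usage object, field names differ between providers (snake_case for some APIs, camelCase for others). A small normalizer like the following can smooth that over in a dashboard; the helper and the exact field list are illustrative, not part of the package:

```typescript
// Hypothetical normalizer: look up the prompt-token count under the
// common provider-specific field names and return the first match.
function promptTokens(providerMetrics: Record<string, unknown>): number | undefined {
  for (const key of ["prompt_tokens", "promptTokenCount", "input_tokens"]) {
    const v = providerMetrics[key];
    if (typeof v === "number") return v;
  }
  return undefined; // unknown provider shape
}
```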

Trace Tree

Every agent.run() produces a trace like:
──────────────────────────────────────────
  Trace abc123  duration=1240ms
  agent=assistant
──────────────────────────────────────────
  ✓ agent.run        [0ms → +1240ms]  578 tok
  ├─ ✓ tool.get_weather  [450ms → +35ms]
  ├─ ✓ tool.search       [500ms → +120ms]
──────────────────────────────────────────

Exporters

Shorthand | Env Vars | Description
"console" | (none) | Pretty-print trace tree to terminal
"langfuse" | LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY | Native Langfuse format with generations and spans
"otel" | OTEL_EXPORTER_OTLP_ENDPOINT | OTLP/HTTP JSON to any OpenTelemetry collector
"json-file" | (none) | Append traces to a JSON file
Plus CallbackExporter for custom integrations:
import { instrument, CallbackExporter } from "@radaros/observability";

instrument(agent, {
  exporters: [new CallbackExporter((trace) => myCustomSink(trace))],
});

Metrics

const snap = obs.metrics.getMetrics();

snap.counters.runs_total;          // Total runs
snap.counters.runs_success;        // Successful runs
snap.counters.runs_error;          // Failed runs
snap.counters.tool_calls_total;    // Total tool invocations
snap.counters.handoffs_total;      // Agent handoffs
snap.counters.cache_hits;          // Semantic cache hits
snap.counters.cache_misses;        // Semantic cache misses
snap.histograms.run_duration_ms;   // Array of run durations
snap.histograms.tool_latency_ms;   // Array of tool latencies
snap.gauges.total_tokens;          // Total tokens consumed
snap.gauges.total_cost_usd;        // Total cost from CostTracker events
snap.rates.cache_hit_ratio;        // Hits / (hits + misses)
snap.rates.error_rate;             // Errors / total
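The histogram fields are raw arrays of samples, so summary statistics beyond the built-in rates are computed client-side. A small percentile helper (hypothetical, not part of the package) using the nearest-rank method:

```typescript
// Nearest-rank percentile over a raw sample array, e.g.
// percentile(snap.histograms.run_duration_ms, 95) for p95 latency.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  // Index of the p-th percentile, clamped to the array bounds.
  const idx = Math.max(
    0,
    Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1),
  );
  return sorted[idx];
}
```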

Structured Logging

Three drain modes:
// JSON to stdout (for log aggregators like Datadog, ELK)
instrument(agent, { exporters: ["console"], structuredLogs: "json" });

// Plain text to stdout
instrument(agent, { exporters: ["console"], structuredLogs: "console" });

// Custom function
instrument(agent, {
  exporters: ["console"],
  structuredLogs: (entry) => myLogger.log(entry),
});
Each entry includes traceId for correlation with traces.
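That traceId makes it easy to reassemble per-run log streams in a custom drain. A sketch, assuming only a minimal entry shape (the real entries carry more fields):

```typescript
// Group structured log entries by traceId, e.g. inside a custom
// structuredLogs callback that buffers entries before shipping them.
interface LogEntry {
  traceId: string;
  event: string;
}

function groupByTrace(entries: LogEntry[]): Map<string, LogEntry[]> {
  const groups = new Map<string, LogEntry[]>();
  for (const e of entries) {
    const bucket = groups.get(e.traceId) ?? [];
    bucket.push(e);
    groups.set(e.traceId, bucket);
  }
  return groups;
}
```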

Works With Teams & Workflows

Use instrumentBus() to attach to any EventBus:
import { Team } from "@radaros/core";
import { instrumentBus } from "@radaros/observability";

const team = new Team({ ... });
const obs = instrumentBus(team.eventBus, { exporters: ["langfuse", "console"] });

Langfuse Integration Example

Langfuse provides an open-source LLM observability dashboard. Set up in 3 steps:
# 1. Set environment variables
export LANGFUSE_PUBLIC_KEY="pk-..."
export LANGFUSE_SECRET_KEY="sk-..."
export LANGFUSE_BASE_URL="https://cloud.langfuse.com" # or self-hosted URL
// 2. Instrument your agent
import { Agent, openai } from "@radaros/core";
import { instrument } from "@radaros/observability";

const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  instructions: "You are a helpful assistant.",
});

const obs = instrument(agent, {
  exporters: ["langfuse"],
});

// 3. Every run is now traced in Langfuse
await agent.run("What is quantum computing?", {
  sessionId: "session-abc",
  userId: "user-42",
});

// Flush traces before process exit
await obs.tracer.flush();
In the Langfuse dashboard, you’ll see:
  • Traces for each agent.run() with duration, token usage, and cost
  • Generations for each LLM call within a run
  • Spans for tool calls, handoffs, and other operations
  • Sessions grouping traces by sessionId

OpenTelemetry Export

Send traces to any OTLP-compatible backend (Jaeger, Grafana Tempo, Honeycomb, etc.):
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer my-token"
import { instrument } from "@radaros/observability";

const obs = instrument(agent, {
  exporters: ["otel"],
});
Traces follow the OpenTelemetry semantic conventions for GenAI, making them compatible with standard OTLP tooling.

Building a Custom Dashboard

Combine metrics and events to build a real-time dashboard:
import { Agent, openai, CostTracker } from "@radaros/core";
import { instrument } from "@radaros/observability";

const tracker = new CostTracker();
const agent = new Agent({
  name: "assistant",
  model: openai("gpt-4o"),
  costTracker: tracker,
});

const obs = instrument(agent, { exporters: ["console"] });

// Periodic metrics snapshot
setInterval(() => {
  const m = obs.metrics.getMetrics();
  const cost = tracker.getSummary();

  console.log({
    totalRuns: m.counters.runs_total,
    successRate: (1 - m.rates.error_rate) * 100,
    cacheHitRate: m.rates.cache_hit_ratio * 100,
    avgLatency: average(m.histograms.run_duration_ms),
    totalCost: cost.totalCost,
    totalTokens: m.gauges.total_tokens,
  });
}, 30_000);

function average(arr: number[]): number {
  return arr.length ? arr.reduce((a, b) => a + b, 0) / arr.length : 0;
}

Capacity Metrics

When the Session Profiler is attached to the same EventBus, MetricsExporter automatically captures capacity-related metrics:

AgentMetrics fields

Field | Type | Description
estimatedKvCacheGb | number? | Estimated total KV cache memory across all sessions
avgContextLength | number? | Average prompt tokens per run
sessionCategories | Record<string, number>? | Session counts by category (light/medium/heavy/extreme)

Prometheus output

The toPrometheus() method includes three additional capacity metrics (a gauge and two counters):
# HELP radaros_kv_cache_estimated_gb Estimated KV cache size in GB
# TYPE radaros_kv_cache_estimated_gb gauge
radaros_kv_cache_estimated_gb 12.5

# HELP radaros_session_category_total Sessions by category
# TYPE radaros_session_category_total counter
radaros_session_category_total{category="light"} 5
radaros_session_category_total{category="medium"} 2
radaros_session_category_total{category="heavy"} 1

# HELP radaros_capacity_sessions_total Total tracked sessions
# TYPE radaros_capacity_sessions_total counter
radaros_capacity_sessions_total 8
These appear automatically when capacity.session.classified and capacity.warning events are emitted on the EventBus — no additional configuration needed.
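For tests or ad-hoc dashboards it can be handy to read values back out of the toPrometheus() output. A minimal line parser for the samples above (deliberately not a full Prometheus text-format parser; it keeps the label block as part of the key):

```typescript
// Parse simple exposition-format lines into a name{labels} -> value map,
// skipping blank lines and # HELP / # TYPE comments.
function parsePrometheus(text: string): Map<string, number> {
  const out = new Map<string, number>();
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const space = trimmed.lastIndexOf(" ");
    out.set(trimmed.slice(0, space), Number(trimmed.slice(space + 1)));
  }
  return out;
}
```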