# Edge Runtime

The `EdgeRuntime` manages an agent on constrained hardware with automatic watchdog restarts, resource monitoring, health endpoints, and graceful degradation.

## Quick Start
```typescript
import { Agent, ollama } from "@radaros/core";
import { EdgeRuntime, SystemToolkit, edgePreset } from "@radaros/edge";

const preset = edgePreset("pi5-8gb");

const agent = new Agent({
  name: "pi-agent",
  model: ollama(preset.recommendedModel),
  instructions: "You are a Raspberry Pi assistant.",
  tools: [...new SystemToolkit().getTools()],
});

const runtime = new EdgeRuntime({
  preset,
  agent,
  healthPort: 9090,
});

await runtime.start();

// Signal agent activity to prevent watchdog restarts
runtime.heartbeat();

// Check status
const status = runtime.getStatus();
console.log(status.state); // "running" | "degraded" | "stopped"

// Shutdown
await runtime.stop();
```
## Presets

Use `edgePreset(id)` to get optimized defaults for your device:
```typescript
import { edgePreset, listEdgePresets, customEdgePreset } from "@radaros/edge";

const presets = listEdgePresets();
// [{ id: "pi4-2gb", label: "..." }, { id: "pi4-4gb", label: "..." }, ...]

const preset = edgePreset("pi5-8gb");
// { recommendedModel: "phi3:mini", maxTokens: 2048, contextWindow: 16384, ... }

// Customize a preset
const custom = customEdgePreset("pi5-8gb", { maxTokens: 4096 });
```
| Preset | Model | Max Tokens | Context | Memory Limit |
|---|---|---|---|---|
| `pi4-2gb` | tinyllama:1.1b | 256 | 2048 | 512 MB |
| `pi4-4gb` | tinyllama:1.1b | 512 | 4096 | 1024 MB |
| `pi4-8gb` | llama3.2:1b | 1024 | 8192 | 2048 MB |
| `pi5-4gb` | llama3.2:1b | 1024 | 8192 | 1536 MB |
| `pi5-8gb` | phi3:mini | 2048 | 16384 | 3072 MB |
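The tiers above follow total RAM. As a rough illustration of choosing a preset ID from detected memory, here is a small sketch; the helper `pickPreset` is hypothetical and not part of `@radaros/edge`:

```typescript
// Hypothetical helper: map a device's board model and total RAM to the
// preset IDs in the table above. Illustrative only, not a library API.
type PresetId = "pi4-2gb" | "pi4-4gb" | "pi4-8gb" | "pi5-4gb" | "pi5-8gb";

function pickPreset(model: "pi4" | "pi5", totalRamGb: number): PresetId {
  if (model === "pi4") {
    if (totalRamGb <= 2) return "pi4-2gb";
    if (totalRamGb <= 4) return "pi4-4gb";
    return "pi4-8gb";
  }
  return totalRamGb <= 4 ? "pi5-4gb" : "pi5-8gb";
}

console.log(pickPreset("pi5", 8)); // "pi5-8gb"
```

The result can be passed straight to `edgePreset(id)` or used as the base for `customEdgePreset`.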
## Features

### Watchdog

Automatically detects unresponsive agents. If no `heartbeat()` call is received within the timeout window, the runtime emits a `watchdog-restart` event.
```typescript
runtime.on("watchdog-restart", ({ reason, restarts }) => {
  console.log(`Watchdog triggered: ${reason} (${restarts} total)`);
  // Recreate or restart your agent here
});
```
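The mechanism behind this can be sketched in plain TypeScript: a countdown timer that fires unless a heartbeat re-arms it first. This is a simplified stand-in, not the actual `EdgeRuntime` internals:

```typescript
// Minimal watchdog sketch: if no heartbeat arrives within `timeoutMs`,
// invoke the restart callback and keep watching. Illustrative only.
class Watchdog {
  private timer?: ReturnType<typeof setTimeout>;
  restarts = 0;

  constructor(
    private timeoutMs: number,
    private onRestart: (reason: string) => void,
  ) {}

  start() {
    this.arm();
  }

  // Agent activity resets the countdown
  heartbeat() {
    this.arm();
  }

  stop() {
    if (this.timer) clearTimeout(this.timer);
  }

  private arm() {
    if (this.timer) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.restarts += 1;
      this.onRestart(`no heartbeat within ${this.timeoutMs} ms`);
      this.arm(); // continue monitoring after a restart
    }, this.timeoutMs);
  }
}
```

In practice you call `runtime.heartbeat()` from your agent's main loop (e.g. after each completed turn) so that normal activity keeps the timer armed.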
### Resource Monitor

Periodically checks CPU temperature, memory, and disk usage. Emits warnings when thresholds are exceeded.
```typescript
runtime.on("thermal-warning", ({ temperature, threshold }) => {
  console.log(`CPU at ${temperature}°C (threshold: ${threshold}°C)`);
});

runtime.on("memory-warning", ({ usage_percent, threshold }) => {
  console.log(`Memory at ${usage_percent}% (threshold: ${threshold}%)`);
});

runtime.on("recovered", () => {
  console.log("Resources back to normal");
});
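The warning/recovered pairing implies simple hysteresis: a warning fires while a reading is above its threshold, and a single `recovered` event fires once it drops back. A self-contained sketch of that state machine (illustrative; not the `ResourceMonitor` source):

```typescript
// Tiny hysteresis sketch for the memory-warning / recovered pattern above.
// Illustrative only, not the actual ResourceMonitor implementation.
type MonitorEvent = {
  type: "memory-warning" | "recovered";
  usage_percent?: number;
};

function makeMemoryChecker(threshold: number) {
  let warned = false;
  return (usagePercent: number): MonitorEvent | null => {
    if (usagePercent > threshold) {
      warned = true;
      return { type: "memory-warning", usage_percent: usagePercent };
    }
    if (warned) {
      // First reading back under the threshold: emit recovered once
      warned = false;
      return { type: "recovered" };
    }
    return null; // normal reading, nothing to report
  };
}

const check = makeMemoryChecker(85);
check(60); // null
check(92); // memory-warning
check(70); // recovered
```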
### Health Endpoint

A lightweight HTTP server on port 9090 (configurable) responds to `GET /health`:
```json
{
  "state": "running",
  "uptime_ms": 3600000,
  "watchdog_restarts": 0,
  "resources": { "cpu": { ... }, "memory": { ... }, "disk": { ... } },
  "degraded_reason": null
}
```
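An external supervisor (systemd, Kubernetes, a cron job) can poll this endpoint and decide whether to intervene. A minimal sketch of interpreting the payload; the field names follow the example above, but the types and the decision policy are assumptions:

```typescript
// Interpret a /health payload for an external supervisor.
// Field names follow the example payload; the policy itself is illustrative.
interface HealthStatus {
  state: "running" | "degraded" | "stopped";
  uptime_ms: number;
  watchdog_restarts: number;
  degraded_reason: string | null;
}

function isHealthy(h: HealthStatus): boolean {
  // Treat "degraded" as alive-but-unwell; only "stopped" means dead
  return h.state !== "stopped";
}

function summarize(h: HealthStatus): string {
  const up = (h.uptime_ms / 3_600_000).toFixed(1);
  const base = `${h.state}, up ${up} h, ${h.watchdog_restarts} watchdog restarts`;
  return h.degraded_reason ? `${base} (${h.degraded_reason})` : base;
}

summarize({
  state: "running",
  uptime_ms: 3_600_000,
  watchdog_restarts: 0,
  degraded_reason: null,
});
// returns "running, up 1.0 h, 0 watchdog restarts"
```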
## Config

- `preset` (`string | EdgePreset`, required): Device preset ID or custom preset object.
- `agent` (`Agent`, required): The agent instance to manage.
- `healthPort` (`number`): Port for the health check HTTP server. A separate option disables the health endpoint entirely.
## GPU Monitoring

The `ResourceMonitor` automatically detects NVIDIA GPUs via `nvidia-smi` and includes GPU metrics in every snapshot:
```typescript
import { ResourceMonitor } from "@radaros/edge";

const monitor = new ResourceMonitor({ intervalMs: 5000 });

monitor.on("snapshot", (snap) => {
  if (snap.gpu) {
    console.log(`GPU: ${snap.gpu.name}`);
    console.log(`Memory: ${snap.gpu.memoryUsedGb.toFixed(1)}/${snap.gpu.memoryTotalGb.toFixed(1)} GB`);
    console.log(`Utilization: ${snap.gpu.utilizationPercent}%`);
    console.log(`Temperature: ${snap.gpu.temperatureC}°C`);
  }
});

monitor.on("gpu-warning", (data) => {
  console.log(`GPU HBM pressure: ${data.memoryUsedGb.toFixed(1)}/${data.memoryTotalGb.toFixed(1)} GB`);
});

monitor.start();
```
### GPU Snapshot Fields

| Field | Type | Description |
|---|---|---|
| `gpu.name` | string | GPU model name (e.g. "NVIDIA H100 SXM") |
| `gpu.memoryUsedGb` | number | Used HBM in GB |
| `gpu.memoryTotalGb` | number | Total HBM in GB |
| `gpu.utilizationPercent` | number | GPU compute utilization (0–100) |
| `gpu.temperatureC` | number | GPU temperature in Celsius |
The `gpu-warning` event fires when GPU memory usage exceeds the `memoryThreshold` (default 85%). GPU monitoring is automatic: if `nvidia-smi` is not available (e.g. on CPU-only machines), the `gpu` field is simply omitted from snapshots.
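The 85% default means, for example, that an 80 GB card triggers the warning once more than 68 GB is in use. The condition reduces to a one-line check; the helper name below is illustrative, not a library API:

```typescript
// Sketch of the gpu-warning condition: used HBM as a percentage of total,
// compared against the threshold (default 85%). Name is hypothetical.
function gpuMemoryPressure(
  memoryUsedGb: number,
  memoryTotalGb: number,
  memoryThreshold = 85,
): boolean {
  return (memoryUsedGb / memoryTotalGb) * 100 > memoryThreshold;
}

gpuMemoryPressure(68.5, 80); // true  (85.6% > 85%)
gpuMemoryPressure(60, 80);   // false (75%)
```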
## Connecting to Capacity Planning

The GPU snapshot data can be combined with the Capacity Planning module to compare actual GPU usage against theoretical capacity:
```typescript
import { ResourceMonitor } from "@radaros/edge";
import {
  planCapacity, SessionProfiler,
  DEFAULT_ARCHITECTURES, DEFAULT_GPU_SPECS,
} from "@radaros/core";

const monitor = new ResourceMonitor({ intervalMs: 10_000 });

monitor.on("snapshot", (snap) => {
  if (!snap.gpu) return;

  // Real GPU data
  const freeGpuGb = snap.gpu.memoryTotalGb - snap.gpu.memoryUsedGb;

  // Theoretical capacity for this hardware
  const plan = planCapacity(
    DEFAULT_ARCHITECTURES["llama-3.1-70b"],
    {
      gpu: DEFAULT_GPU_SPECS["rtx-a5000"],
      gpuCount: 8,
      nandPerGpuGb: 0,
      nandBandwidthGBs: 7,
    },
    { extreme: 1, heavy: 2, medium: 3, light: 4 },
    "fp8", "int4",
  );

  console.log(`Actual free GPU memory: ${freeGpuGb.toFixed(1)} GB`);
  console.log(`Theoretical free for KV: ${plan.freeHbmForKvGb} GB`);
  console.log(`Utilization: ${snap.gpu.utilizationPercent}%`);
});

monitor.start();
```
When paired with the Session Profiler on the same `EventBus`, you get a complete picture: real GPU usage from `nvidia-smi`, real token counts from the LLM API, and theoretical capacity limits, all feeding into the same Prometheus metrics.