System Architecture

This page describes the RadarOS architecture: how packages are organized, how layers interact, and how data flows through the system.

Monorepo Structure

RadarOS is organized as a monorepo with four primary packages. Each package has a focused responsibility and can be used independently or together.

Package Overview

| Package | Purpose |
| --- | --- |
| @radaros/core | Agents, models, tools, memory, storage, voice agents, vector stores, MCP client, A2A client |
| @radaros/transport | Express REST API, Socket.IO gateway, Voice gateway, Browser gateway, A2A server |
| @radaros/queue | BullMQ background job processing |
| @radaros/browser | Vision-based autonomous browser automation with Playwright |

Layered Architecture

RadarOS is built in layers. Higher layers depend on lower ones, and infrastructure is pluggable.
  1. SDK Layer — Agent, Team, Workflow, VoiceAgent, BrowserAgent. The primary API surface for defining behavior, orchestrating agents, and running workflows.
  2. Engine Layer — LLM Loop, Tool Executor, MemoryManager (sessions, summaries, user facts, user profile, entities, decisions, learnings), SkillManager. Core execution logic with automatic retry, tool caching, token-based history trimming, reasoning, and cross-session personalization.
  3. Safety Layer — Sandbox (isolated subprocess execution with timeout and memory limits), Approval Manager (human-in-the-loop gating before tool execution), Guardrails (input/output validation).
  4. Model Abstraction — ModelProvider interface and adapters for text models. RealtimeProvider interface for voice/streaming models. Factory functions: openai(), anthropic(), google(), ollama(), vertex(), openaiRealtime(), googleLive().
  5. Protocol Integration — MCP Client for consuming external tools, A2A Client for calling remote agents.
  6. Infrastructure — Storage (in-memory, SQLite, PostgreSQL, MongoDB), Vector Stores, and Embeddings. All pluggable.
  7. Registry & Auto-Discovery — Agents, Teams, and Workflows auto-register into a global Registry on construction. Transport layers read from the registry dynamically, so entities created at any time are immediately available over HTTP and WebSocket without restart or re-wiring.
  8. Transport (Optional) — Express REST, Socket.IO WebSocket, Voice Gateway (real-time audio streaming), Browser Gateway (live browser observation), and A2A Server. Uses the Registry for live auto-discovery of agents, teams, and workflows.
  9. Queue (Optional) — BullMQ workers for background job processing.
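The Model Abstraction layer (item 4) is the seam that makes providers swappable. The shapes below are an illustrative sketch of how such an interface could look, not the actual @radaros/core types; `echoProvider` is a made-up stub standing in for factories like openai() or anthropic().

```typescript
// Hypothetical shapes for the Model Abstraction layer; the real
// @radaros/core interfaces may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface ModelResponse {
  text?: string;
  toolCalls?: { name: string; args: unknown }[];
}

// Every text-model adapter satisfies one common interface,
// which is what makes providers swappable at the SDK layer.
interface ModelProvider {
  name: string;
  generate(messages: ChatMessage[]): Promise<ModelResponse>;
}

// Stub adapter used purely for illustration.
function echoProvider(): ModelProvider {
  return {
    name: "echo",
    async generate(messages) {
      const last = messages[messages.length - 1];
      return { text: `echo: ${last.content}` };
    },
  };
}
```

Because the engine only depends on this interface, swapping OpenAI for Ollama is a one-line change at construction time.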

Data Flow — Text Agent

A typical text agent request flows through the system as follows:
User Input

Agent.run() / Agent.stream()

buildMessages (history + system instructions + MemoryManager.buildContext() + skill instructions)

LLM Loop (with retry)

ModelProvider (OpenAI / Anthropic / Google / Ollama / Vertex)

Response (text / tool calls)

Tool Executor (if tool calls)
  ├── Approval check (if requiresApproval is set)
  ├── Sandbox execution (if sandbox is enabled)
  ├── Local tools (with optional caching)
  ├── MCP tools (external servers)
  └── A2A tools (remote agents)

Loop until final response

MemoryManager.appendMessages() → auto-summarize overflow

MemoryManager.afterRun() → fire-and-forget extraction
  (user facts, user profile, entities, learnings)

Output to caller

Detailed Flow

  1. User Input — A string or multi-modal content (text, images, files).
  2. Agent — Receives input, loads session history from MemoryManager, injects memory context and skill instructions into the system prompt.
  3. buildMessages — Constructs the message array: system prompt (with summaries, user facts, user profile, entities, decisions, learnings, skill instructions), session history (auto-trimmed if maxTokens is set), current user message.
  4. LLM Loop — Sends messages to the model with automatic retry on transient failures (429, 5xx, network errors).
  5. ModelProvider — Translates to the provider API format.
  6. Response — Either text or tool calls.
  7. Tool Executor — If tool calls:
    • Checks human approval if requiresApproval is set on the tool or agent.
    • Runs the tool in a sandboxed subprocess if sandbox is enabled.
    • Executes the tool, appends results, and loops back to the model.
  8. MemoryManager.appendMessages — Persists the new turn to session storage and auto-summarizes overflow.
  9. MemoryManager.afterRun — Asynchronously extracts user facts, user profile, entities, and learnings from the conversation for future personalization.
  10. Output — Returns or streams the final response to the caller.
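The retry behavior in step 4 can be sketched as a small wrapper. `withRetry` and its parameters are hypothetical names for illustration, not RadarOS API; the real LLM Loop also handles tool calls, streaming, and history trimming.

```typescript
// Illustrative sketch of automatic retry with exponential backoff
// on transient failures (429, 5xx, network errors).
async function withRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 250,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      const status = (err as { status?: number }).status;
      // Simplification: treat a missing status code as a network error.
      const transient =
        status === undefined || status === 429 || status >= 500;
      if (!transient) throw err; // e.g. 400-level errors fail fast
      // Exponential backoff: 250ms, 500ms, 1000ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```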

Data Flow — Voice Agent

Audio Input (WebSocket / Socket.IO)

VoiceAgent.connect()

RealtimeProvider (OpenAI Realtime / Google Live)

Bidirectional audio stream

Tool calls (if any) → Tool Executor

MemoryManager.appendMessages() (session persistence)

MemoryManager.afterRun() (non-blocking extraction)

Audio Output → Client

The VoiceAgent manages:
  • VoiceSession — wraps the realtime provider connection, routes tool calls, emits events.
  • Session persistence — conversation history saved via MemoryManager, restored on reconnect.
  • Memory extraction — user facts, profile, entities, and learnings extracted from voice transcripts (non-blocking).
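The session-persistence idea above can be illustrated with a minimal in-memory stand-in. `SessionStore` is a toy sketch, not the actual MemoryManager; real RadarOS storage drivers (SQLite, PostgreSQL, MongoDB) persist across processes, which is what makes restore-on-reconnect possible.

```typescript
// Toy in-memory session store illustrating append + restore-by-id.
type StoredMessage = { role: string; content: string };

class SessionStore {
  private sessions = new Map<string, StoredMessage[]>();

  // Persist a new turn under its session id.
  appendMessages(sessionId: string, messages: StoredMessage[]): void {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(...messages);
    this.sessions.set(sessionId, history);
  }

  // On reconnect, prior turns are reloaded by session id.
  getHistory(sessionId: string): StoredMessage[] {
    return this.sessions.get(sessionId) ?? [];
  }
}
```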

Data Flow — Browser Agent

Task (string)

BrowserAgent.run()

Launch Playwright (with stealth config + humanize settings)

Screenshot → ModelProvider (vision)

LLM decides action (click, type, scroll, navigate, done, fail)

BrowserProvider executes action
  ├── CredentialVault resolves {{placeholders}} for type actions
  ├── DOM extraction (optional, for hybrid vision+DOM approach)
  └── Loop detection (maxRepeats threshold)

Screenshot → next iteration

Loop until "done" or "fail" or maxSteps reached

Close browser (with optional cookie/auth persistence)

Output result + action history

The BrowserAgent supports:
  • Stealth mode — patches navigator.webdriver, WebGL, plugins to avoid bot detection.
  • Humanize mode — random delays, mouse movement curves, typing variation.
  • Credential vault — secrets never reach the LLM; only {{placeholders}} are used.
  • Video recording — Playwright-native recording of browser sessions.
  • Parallel browsing — multiple pages/tabs via BrowserProvider.
  • Cookie persistence — save and restore storageState across runs.
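The credential-vault idea can be sketched in a few lines: the LLM only ever sees {{placeholders}}, and real values are substituted just before the browser types. The class and method names below are illustrative, not RadarOS's actual vault API.

```typescript
// Sketch: resolve {{placeholders}} in a "type" action's text
// so secrets never appear in prompts or model output.
class CredentialVault {
  constructor(private secrets: Record<string, string>) {}

  resolve(text: string): string {
    return text.replace(/\{\{(\w+)\}\}/g, (match, key: string) => {
      const value = this.secrets[key];
      if (value === undefined) {
        throw new Error(`No secret registered for placeholder ${match}`);
      }
      return value;
    });
  }
}
```

Failing loudly on an unknown placeholder is deliberate: silently typing a literal `{{password}}` into a login form would be worse than aborting the step.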

Event System

All agents emit typed events via the EventBus. This enables logging, analytics, transport integration, and custom middleware without coupling.
| Event | Emitted by |
| --- | --- |
| run.start, run.complete, run.error | Agent |
| run.stream.chunk | Agent (streaming) |
| tool.call, tool.result, tool.error | Tool Executor |
| tool.approval.request, tool.approval.response | Approval Manager |
| voice.session.start, voice.session.end | VoiceAgent |
| voice.tool.call, voice.tool.result | VoiceSession |
| browser.step, browser.action, browser.done, browser.error | BrowserAgent |
| memory.extract, memory.stored, memory.error | MemoryManager |
| skill.loaded, skill.learned | SkillManager |
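A typed event bus in the spirit of the table can be sketched as follows. This is a minimal illustration of the pattern; the actual EventBus API in @radaros/core may differ, and the `AgentEvents` map below covers only two of the events listed.

```typescript
// Example event map mirroring two events from the table.
type AgentEvents = {
  "run.start": { agent: string };
  "tool.call": { tool: string; args: unknown };
};

// Minimal typed pub/sub: the event name selects the payload type.
class EventBus<Events extends Record<string, unknown>> {
  private listeners = new Map<keyof Events, Array<(payload: any) => void>>();

  on<K extends keyof Events>(event: K, fn: (payload: Events[K]) => void): void {
    const list = this.listeners.get(event) ?? [];
    list.push(fn);
    this.listeners.set(event, list);
  }

  emit<K extends keyof Events>(event: K, payload: Events[K]): void {
    for (const fn of this.listeners.get(event) ?? []) fn(payload);
  }
}
```

Subscribers (loggers, analytics, transports) attach with `on` and never touch agent internals, which is what keeps the layers decoupled.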

Memory Architecture

RadarOS provides a unified memory system through MemoryManager. A single memory config works identically across Agent, VoiceAgent, and BrowserAgent.
| Store | Scope | Default | Purpose |
| --- | --- | --- | --- |
| Sessions | Per-session | ON | Message history, auto-trimmed by maxMessages or maxTokens. |
| Summaries | Per-session | ON | LLM-generated summaries of overflow messages for long-term context. |
| User Facts | Per-user, cross-session | OFF | Extracted facts — “prefers dark mode”, “lives in Mumbai”. |
| User Profile | Per-user, cross-session | OFF | Structured data — name, role, company, timezone. |
| Entity Memory | Global / per-namespace | OFF | Companies, people, projects with facts, events, relationships. |
| Decision Log | Per-agent | OFF | Audit trail of agent decisions — what, why, outcome. |
| Learned Knowledge | Global (vector-backed) | OFF | Reusable insights discovered during conversations. |
All stores share a single StorageDriver (InMemory, SQLite, PostgreSQL, MongoDB). All extraction is non-blocking (fire-and-forget).
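Token-based trimming of session history (oldest first) can be sketched as below. This is an assumption-laden illustration: real token counting uses a model tokenizer, and `trimHistory` with its chars-divided-by-4 heuristic is hypothetical, not the RadarOS implementation.

```typescript
type Msg = { role: string; content: string };

// Crude token estimate for illustration only: ~4 characters per token.
const approxTokens = (m: Msg) => Math.ceil(m.content.length / 4);

// Keep the newest messages that fit the budget; drop the oldest first.
function trimHistory(history: Msg[], maxTokens: number): Msg[] {
  const kept: Msg[] = [];
  let total = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = approxTokens(history[i]);
    if (total + cost > maxTokens) break;
    kept.unshift(history[i]);
    total += cost;
  }
  return kept;
}
```

Trimmed-out messages are not lost: as the Summaries row above notes, overflow is summarized so long-term context survives the cut.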

Skills Architecture

Skills are pre-packaged tool bundles loaded from local directories, npm packages, or remote URLs. The SkillManager orchestrates loading and provides lazy initialization (loaded on first run, not at construction).
| Feature | Description |
| --- | --- |
| Pre-packaged Skills | Local, npm, or remote tool bundles with manifests. |
| Learned Skills | Agent-saved multi-step tool call patterns for replay. |
| Lazy Loading | Skills loaded on first run(), not at construction. |
| Instruction Injection | Skill instructions auto-injected into system prompt. |

Registry & Auto-Discovery

RadarOS includes a global Registry singleton. Every Agent, Team, and Workflow automatically registers itself on construction (unless register: false is set).
```typescript
import { Agent, openai, registry } from "@radaros/core";

new Agent({ name: "bot", model: openai("gpt-4o") });

registry.list();
// { agents: ["bot"], teams: [], workflows: [] }
```

The Express router and Socket.IO gateway read from this registry at request time. Agents created after the transport layer starts become available immediately — no restart or re-wiring needed.
| Feature | Description |
| --- | --- |
| Auto-register | Instances register on construction. Opt out with register: false. |
| kind discriminant | Each class has a readonly kind ("agent", "team", "workflow") for reliable runtime type identification. |
| Dynamic routing | Transport routes resolve by name from the registry on each request. |
| List endpoints | GET /agents, GET /teams, GET /workflows return metadata. GET /registry returns all names. |
| Custom registries | Pass a custom Registry instance to createAgentRouter() or createAgentGateway() for isolated scoping. |

Performance Optimizations

| Optimization | Impact |
| --- | --- |
| Tool schema caching | Tool definitions are converted to JSON Schema once at construction, not on every LLM roundtrip. |
| Minimal schema serialization | Strips verbose JSON Schema fields ($schema, additionalProperties) to reduce token overhead. |
| Strict mode | Optional strict: true on tools enables OpenAI Structured Outputs for guaranteed valid JSON. |
| Session read deduplication | Session data is loaded once per run/stream call and reused for both context and history. |
| Non-blocking memory extraction | All memory extraction (facts, profile, entities, learnings) runs in the background without blocking. |
| Token-based history trimming | maxContextTokens auto-trims history (oldest first) to prevent context window overflow. |
| Automatic retry | Transient LLM API failures (429, 5xx, network errors) are retried with exponential backoff. |
| Streaming usage tracking | Token usage is accurately tracked in both run and stream modes. |
| Sandbox subprocess pooling | Sandboxed tools run in isolated child processes without affecting the main event loop. |
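The "minimal schema serialization" row can be illustrated with a small recursive strip. The field list here is only the two fields the table names; the real implementation may strip more, and `minimizeSchema` is a hypothetical name.

```typescript
// Recursively drop JSON Schema fields that cost tokens without
// helping the model ($schema, additionalProperties).
function minimizeSchema(schema: unknown): unknown {
  if (Array.isArray(schema)) return schema.map(minimizeSchema);
  if (schema && typeof schema === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(schema as Record<string, unknown>)) {
      if (key === "$schema" || key === "additionalProperties") continue;
      out[key] = minimizeSchema(value);
    }
    return out;
  }
  return schema;
}
```

Since every tool schema is re-sent on every LLM roundtrip, shaving even a few tokens per tool compounds quickly in long tool-using conversations.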

Core Design Principles

  • Zero Meta-Framework Dependency — No Next.js, Remix, or framework-specific runtime. Use RadarOS with any Node.js server, or run it headless.
  • Optional Peer Dependencies — Providers (openai, anthropic, etc.) are peer dependencies. Lazy-loaded so you only bundle what you use.
  • Event-Driven — EventBus emits lifecycle events. Subscribe for logging, analytics, or custom middleware.
  • Pluggable Everything — Storage, models, vector stores, and transport are all swappable. Configure once, change later without rewriting logic.
  • Safety by Default — Sandbox execution and human-in-the-loop approval are opt-in per tool or agent-wide. Guardrails validate input and output.
  • Open Protocol Support — MCP for tool integration and A2A for agent interoperability. Connect to the broader AI ecosystem without vendor lock-in.
  • Production Resilient — Automatic retry with exponential backoff, token-based context trimming, and non-blocking background operations ensure reliability at scale.