Ollama (Local Models)

Use Ollama to run open-source models locally with RadarOS. No API key required—ideal for development, privacy-sensitive workloads, and cost-free experimentation.

Setup

1. Install Ollama

Download and install Ollama from ollama.ai. Start the Ollama service (it runs on http://localhost:11434 by default).
2. Pull a model

Pull the model you want to use:

ollama pull llama3.1
ollama pull codellama
ollama pull mistral
3. Use in RadarOS

Use the ollama factory with the model name:

import { ollama } from "@radaros/core";

const model = ollama("llama3.1");

Factory

import { ollama } from "@radaros/core";

const model = ollama("llama3.1");
modelId (string, required)
The Ollama model name (e.g., llama3.1, codellama, mistral).

config (object, optional)
Optional configuration. See Config below.

Config

host (string, default: "http://localhost:11434")
Ollama server URL. Use this for remote Ollama instances or custom ports.
Ollama runs locally and does not require an API key. Just ensure the Ollama service is running.

Example

const model = ollama("llama3.1", {
  host: "http://localhost:11434",
});

// Remote Ollama instance
const remoteModel = ollama("mistral", {
  host: "http://192.168.1.100:11434",
});
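
The host can also be resolved from the environment. A minimal sketch: OLLAMA_HOST is the environment variable the Ollama CLI itself reads, and the fallback matches the documented default; the resolveHost helper is illustrative, not part of RadarOS.

```typescript
// Resolve the Ollama host from the environment, falling back to the
// documented default. Reusing OLLAMA_HOST keeps client and server
// configuration in one place. (Hypothetical helper, not a RadarOS API.)
function resolveHost(env: Record<string, string | undefined> = process.env): string {
  return env.OLLAMA_HOST ?? "http://localhost:11434";
}
```

Pass the result as the host config value: `ollama("llama3.1", { host: resolveHost() })`.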

Model       Use Case
llama3.1    General purpose, strong all-around performance
codellama   Code generation and understanding
mistral     Fast, efficient, good for chat
mixtral     Mixture of experts, higher capability
phi3        Small, fast, good for edge devices

Run ollama list to see installed models. Browse ollama.com/library for the full catalog.

Multi-Modal Support

Ollama supports image input for vision-capable models like llava, bakllava, and llama3.2-vision.

Images

Pass images as base64 data in ContentPart[]:
import { Agent, ollama, type ContentPart } from "@radaros/core";
import { readFileSync } from "node:fs";

const agent = new Agent({
  name: "VisionBot",
  model: ollama("llava"),
  instructions: "Describe images in detail.",
});

const imageData = readFileSync("photo.jpg").toString("base64");

const result = await agent.run([
  { type: "text", text: "What's in this image?" },
  { type: "image", data: imageData, mimeType: "image/jpeg" },
] as ContentPart[]);
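
One way to build the image part is to infer the MIME type from the file extension. The imagePart helper below is a hypothetical convenience, not part of RadarOS; it assumes the extension map shown covers your inputs.

```typescript
import { readFileSync } from "node:fs";
import { extname } from "node:path";

// Map common image extensions to MIME types.
const MIME_TYPES: Record<string, string> = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".webp": "image/webp",
};

// Build an image content part from a file on disk (hypothetical helper,
// not a RadarOS API). Throws on extensions it does not recognize.
function imagePart(path: string): { type: "image"; data: string; mimeType: string } {
  const mimeType = MIME_TYPES[extname(path).toLowerCase()];
  if (!mimeType) throw new Error(`Unrecognized image extension: ${path}`);
  return { type: "image", data: readFileSync(path).toString("base64"), mimeType };
}
```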

Unsupported: Audio & Files

Audio and file inputs are not supported by Ollama. If passed, the provider logs a warning and skips them.

Tool Calling

Ollama supports function calling with select models. Enable tools on your agent as usual — RadarOS handles the tool call protocol automatically:
import { Agent, ollama, defineTool } from "@radaros/core";
import { z } from "zod";

const agent = new Agent({
  name: "local-assistant",
  model: ollama("llama3.1"), // Supports tool calling
  tools: [
    defineTool({
      name: "getWeather",
      description: "Get current weather for a city",
      parameters: z.object({ city: z.string() }),
      execute: async ({ city }) => `${city}: 22°C, Sunny`,
    }),
  ],
  instructions: "You are a helpful assistant. Use tools when needed.",
});

const result = await agent.run("What's the weather in Paris?");
console.log(result.text);

Models with Tool Support

Model       Tool Calling
llama3.1    Yes
llama3.2    Yes
mistral     Yes
mixtral     Yes
codellama   No
phi3        No
llava       No (vision only)

Tool calling quality varies by model. Larger models (70B+) are more reliable for complex tool use. For production tool-calling workflows, consider llama3.1:70b or a cloud provider.
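
When model names come from user configuration, a small guard can fail fast instead of letting tools be silently ignored. A sketch; the capability set simply mirrors the table above and should be updated as models gain support.

```typescript
// Models known to support tool calling (mirrors the table above).
const TOOL_CAPABLE = new Set(["llama3.1", "llama3.2", "mistral", "mixtral"]);

// Check tool support, ignoring tag suffixes like ":70b" or ":8x7b".
function supportsTools(modelId: string): boolean {
  return TOOL_CAPABLE.has(modelId.split(":")[0]);
}
```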

Performance Tips

GPU Acceleration

Ollama automatically uses GPU when available. Check GPU usage with:
ollama ps  # Shows running models and their GPU/CPU split
For best performance, ensure your model fits entirely in GPU memory. A 7B model typically needs ~4GB VRAM, 13B needs ~8GB, and 70B needs ~40GB.
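
Those figures follow from a rule of thumb for 4-bit quantized weights (the default quantization for most Ollama models): roughly half a byte per parameter, plus overhead for the KV cache and runtime buffers. A back-of-the-envelope sketch; the 20% overhead factor is an assumption, and real usage varies with context size.

```typescript
// Rough VRAM estimate for a 4-bit quantized model: ~0.5 bytes per
// parameter for weights, plus ~20% overhead (assumed) for KV cache
// and runtime buffers. Returns gigabytes, rounded to one decimal.
function estimateVramGB(paramsBillions: number): number {
  const weightsGB = paramsBillions * 0.5;
  return Math.round(weightsGB * 1.2 * 10) / 10;
}
```

This yields about 4.2 GB for a 7B model, 7.8 GB for 13B, and 42 GB for 70B, in line with the figures above.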

Context Size

By default, Ollama uses a 2048-token context window. For agents with long conversations or large tool results, increase it via the num_ctx parameter:

# In your Modelfile
PARAMETER num_ctx 8192

# Or in an interactive session
>>> /set parameter num_ctx 8192

Model Selection Guidelines

Scenario                Recommended Model
General chat + tools    llama3.1 or llama3.1:70b
Code generation         codellama:13b or codellama:34b
Vision tasks            llava:13b or llama3.2-vision
Fast responses (edge)   phi3:mini or llama3.2:1b
Complex reasoning       mixtral:8x7b or llama3.1:70b
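
The table above can be encoded as a lookup if you select models programmatically. A sketch only: the scenario keys are illustrative, not a RadarOS API, and the defaults just mirror the table.

```typescript
// Default model per scenario, mirroring the recommendation table above.
// The Scenario keys are illustrative, not a RadarOS API.
type Scenario = "chat-tools" | "code" | "vision" | "edge" | "reasoning";

const DEFAULT_MODEL: Record<Scenario, string> = {
  "chat-tools": "llama3.1",
  code: "codellama:13b",
  vision: "llava:13b",
  edge: "phi3:mini",
  reasoning: "mixtral:8x7b",
};

function pickModel(scenario: Scenario): string {
  return DEFAULT_MODEL[scenario];
}
```

The chosen name can then be passed straight to the ollama factory.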

Full Example

import { Agent, ollama } from "@radaros/core";

const agent = new Agent({
  name: "Local Assistant",
  model: ollama("llama3.1"),
  instructions: "You are a helpful assistant running locally.",
});

const output = await agent.run("Explain recursion in one sentence.");
console.log(output.text);