Ollama¶

The Ollama provider connects Missy to locally-running models via the Ollama REST API. No API key required -- everything runs on your hardware.

Setup¶

Step 1: Install Ollama¶

curl -fsSL https://ollama.com/install.sh | sh

Step 2: Pull a model¶

ollama pull llama3.2

Step 3: Configure Missy¶

providers:
  ollama:
    name: ollama
    model: "llama3.2"
    enabled: true

That is it. No API key, no network policy changes for the default local setup.

Step 4: Verify¶

missy providers
missy ask "Hello" --provider ollama

Configuration¶

providers:
  ollama:
    name: ollama
    model: "llama3.2"
    base_url: "http://localhost:11434"    # Default Ollama address
    timeout: 60                           # Local inference can be slower
    enabled: true

Remote Ollama server¶

If Ollama runs on a different machine:

providers:
  ollama:
    name: ollama
    model: "llama3.2"
    base_url: "http://10.0.0.50:11434"
    enabled: true

The base_url host is automatically added to provider_allowed_hosts, so network policy is handled for you.

Model tiers¶

providers:
  ollama:
    name: ollama
    model: "llama3.2"
    fast_model: "llama3.2:1b"
    premium_model: "llama3.2:70b"

Available models¶

Some popular models that work well with Missy:

Model	Size	Good for
`llama3.2`	3B	Fast, general purpose
`llama3.2:1b`	1B	Very fast, simple queries
`llama3.2:70b`	70B	Complex reasoning (needs large GPU)
`mistral`	7B	Good balance of speed and quality
`codellama`	7B	Code-focused tasks
`deepseek-coder-v2`	16B	Code generation and analysis

Pull any model with:

ollama pull MODEL_NAME

How it works¶

Unlike the Anthropic and OpenAI providers, the Ollama provider does not use a vendor SDK. It communicates directly with the Ollama /api/chat endpoint via Missy's PolicyHTTPClient, which means all requests pass through the network policy engine.

The provider uses stream=false to receive the complete response in a single JSON payload.

Configuration reference¶

Field	Type	Default	Description
`name`	string	`"ollama"`	Must be `"ollama"`
`model`	string	`"llama3.2"`	Model name (must be pulled first)
`base_url`	string	`"http://localhost:11434"`	Ollama server URL
`timeout`	int	`30`	Request timeout in seconds (increase for large models)
`enabled`	bool	`true`	Enable/disable

Increase timeout for large models

Large models like llama3.2:70b can take 30+ seconds for first inference. Set timeout: 120 or higher to avoid timeouts.