Ollama provider
Run Ollama models (local daemon or a reachable HTTP API) through NucleusIQ using the official ollama Python SDK, without LangChain.
🟢 Stable: nucleusiq-ollama 0.2.0
nucleusiq-ollama 0.2.0 ships as Development Status :: 5 - Production/Stable (first stable line). Requires nucleusiq>=0.7.12. 98 unit tests, 99.85% coverage.
What you get in 0.2.0
| Capability | Supported |
|---|---|
| Chat (/api/chat) | Yes |
| Streaming (StreamEvent, tokens + metadata) | Yes |
| @tool / function tools | Yes |
| Structured output (format / JSON schema) | Yes, with a caveat: combining response_format with tools drops format with a warning (same caution pattern as Groq) |
| think (reasoning / THINKING stream events) | Yes |
| Vision (image messages) | Yes (new in 0.2.0) |
| LLMCallRecord.provider="ollama" enrichment | Yes (new in 0.2.0) |
| Embeddings | Out of scope for this stable line. |
Prerequisites
- Ollama installed and a model pulled, e.g. ollama pull llama3.2 (names depend on your catalog).
- Daemon reachable at OLLAMA_HOST or the SDK default (http://127.0.0.1:11434 when unset).
Installation
pip install nucleusiq nucleusiq-ollama
Pin the stable line for reproducible builds:
pip install "nucleusiq>=0.7.12" "nucleusiq-ollama>=0.2.0,<0.3"
Dependency: ollama>=0.5.0,<1.0.
Environment
| Variable | Purpose |
|---|---|
| OLLAMA_HOST | Passed as host= to the ollama AsyncClient / Client (omit for local default). |
| OLLAMA_API_KEY | Optional Bearer token for hosted / authenticated endpoints. |
| OLLAMA_MODEL | Optional default model id (examples often use llama3.2). |
# export OLLAMA_HOST=http://127.0.0.1:11434
# export OLLAMA_API_KEY=... # only if your endpoint requires it
export OLLAMA_MODEL=llama3.2
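As a minimal sketch of how a script might read these variables before constructing a client, the helper below treats empty strings as unset. The function name resolve_ollama_env and the llama3.2 fallback are illustrative assumptions, not part of nucleusiq-ollama.

```python
import os
from typing import Mapping, Optional, TypedDict


class OllamaEnv(TypedDict):
    host: Optional[str]
    api_key: Optional[str]
    model: str


def resolve_ollama_env(env: Mapping[str, str] = os.environ) -> OllamaEnv:
    """Hypothetical helper: read the three variables from the table above."""
    return {
        # None means "let the ollama SDK fall back to its own default host".
        "host": env.get("OLLAMA_HOST") or None,
        # Only needed for hosted / authenticated endpoints.
        "api_key": env.get("OLLAMA_API_KEY") or None,
        # The llama3.2 fallback mirrors the examples on this page (assumption).
        "model": env.get("OLLAMA_MODEL") or "llama3.2",
    }
```

Passing a plain dict instead of os.environ keeps the helper easy to exercise in tests without touching the real environment.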
Quick start (Direct)
Use BaseOllama with async_mode=True. Call await agent.initialize() before execute() (matches monorepo examples).
import asyncio

from nucleusiq.agents import Agent
from nucleusiq.agents.config import AgentConfig, ExecutionMode
from nucleusiq.agents.task import Task
from nucleusiq.prompts.zero_shot import ZeroShotPrompt
from nucleusiq_ollama import BaseOllama, OllamaLLMParams


async def main() -> None:
    llm = BaseOllama(model_name="llama3.2", async_mode=True)
    agent = Agent(
        name="ollama-demo",
        prompt=ZeroShotPrompt().configure(
            system="You are a concise assistant.",
        ),
        llm=llm,
        config=AgentConfig(
            execution_mode=ExecutionMode.DIRECT,
            llm_params=OllamaLLMParams(temperature=0.3, max_output_tokens=256),
        ),
    )
    # initialize() must run before execute()
    await agent.initialize()
    result = await agent.execute(
        Task(id="ollama-1", objective="What is the capital of France?"),
    )
    print(result.output)


asyncio.run(main())
Tools (Standard / Autonomous)
Ollama accepts OpenAI-style function tools via to_ollama_function_tool. From the agent's perspective this is the same @tool workflow as other providers.
from nucleusiq.agents import Agent
from nucleusiq.agents.config import AgentConfig, ExecutionMode
from nucleusiq.prompts.zero_shot import ZeroShotPrompt
from nucleusiq.tools.decorators import tool
from nucleusiq_ollama import BaseOllama, OllamaLLMParams


@tool
def add(a: int, b: int) -> str:
    """Add two integers."""
    return str(a + b)


llm = BaseOllama(model_name="llama3.2", async_mode=True)
agent = Agent(
    name="ollama-tools",
    prompt=ZeroShotPrompt().configure(system="Use tools for arithmetic."),
    llm=llm,
    tools=[add],
    config=AgentConfig(
        execution_mode=ExecutionMode.STANDARD,
        llm_params=OllamaLLMParams(temperature=0.4, max_output_tokens=512),
    ),
)
There are no Ollama "native server tools" like Gemini's GoogleTool; only local @tool functions are supported.
Vision (image messages): new in 0.2.0
The _shared/wire.py sanitize_messages helper now splits OpenAI-style multimodal content lists into Ollama's chat-message shape: text parts become the content string, and image_url parts whose URL is a data:image/*;base64,… URL are decoded into Ollama's images field.
# OpenAI-style multimodal input: a text part plus a data: URL image part
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "data:image/png;base64,iVBORw0K..."},
            },
        ],
    }
]

# Ollama chat-message shape: content string plus images field
messages = [
    {
        "role": "user",
        "content": "What's in this image?",
        "images": ["iVBORw0K..."],  # raw base64 strings
    }
]

# Image part carrying raw base64 instead of a data: URL
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this:"},
            {"type": "image", "data": "iVBORw0K..."},  # raw base64
        ],
    }
]
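The conversion described above can be pictured with the following sketch. It is an illustrative reimplementation under stated assumptions, not the actual sanitize_messages code: the function name split_multimodal_message is hypothetical, and joining multiple text parts with a newline is a guess at a reasonable strategy.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def split_multimodal_message(message: dict[str, Any]) -> dict[str, Any]:
    """Illustrative only: flatten an OpenAI-style content list into Ollama's shape."""
    content = message.get("content")
    if not isinstance(content, list):
        return dict(message)  # plain string content passes through unchanged

    texts: list[str] = []
    images: list[str] = []
    for part in content:
        kind = part.get("type")
        if kind == "text":
            texts.append(part.get("text", ""))
        elif kind == "image_url":
            url = part.get("image_url", {}).get("url", "")
            header, sep, payload = url.partition(";base64,")
            if header.startswith("data:image/") and sep:
                images.append(payload)  # keep only the raw base64 payload
            else:
                # Remote URLs are never fetched; they are dropped with a warning.
                logger.warning("Skipping non-data image URL: %s", url)
        elif kind == "image":
            images.append(part.get("data", ""))  # already raw base64

    result = {k: v for k, v in message.items() if k != "content"}
    result["content"] = "\n".join(texts)  # join strategy is an assumption
    if images:
        result["images"] = images
    return result
```

Applied to the first message shown above, this sketch yields the second one: the text becomes the content string and the base64 payload lands in images.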
HTTP image URLs are skipped
Anything that isn't a data: URL (e.g. https://example.com/cat.png) triggers a warning and is omitted from the request. NucleusIQ does not fetch remote images on your behalf. Encode the image client-side as data:image/...;base64,... before sending.
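One way to do the client-side encoding uses only the standard library. The helper names to_data_url and file_to_data_url are hypothetical, and the sketch assumes the MIME type can be guessed from the file extension.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(data: bytes, mime_type: str = "image/png") -> str:
    """Wrap raw image bytes in a data: URL with a base64 payload."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_url(path: str) -> str:
    """Read an image file and return it as a data: URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    return to_data_url(Path(path).read_bytes(), mime_type)
```

The resulting string can be placed in the url field of an image_url part. For an image hosted remotely, download the bytes yourself first and pass them to to_data_url.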
Multi-modal model required
Vision requires a multimodal Ollama model, for example llama3.2-vision, llava, or bakllava. Pull it first:
ollama pull llama3.2-vision
OllamaLLMParams
Extends LLMParams with extra="forbid".
| Field | Meaning |
|---|---|
| think | bool or "low" / "medium" / "high"; maps to Ollama think and streams THINKING events when enabled. |
| keep_alive | Ollama model keep-alive duration (float, str, or None). |
Framework fields such as temperature, max_output_tokens (→ Ollama num_predict), top_p, penalties, stop, seed are merged into Ollama options; see the design doc.
from nucleusiq.agents.config import AgentConfig
from nucleusiq_ollama import OllamaLLMParams

config = AgentConfig(
    llm_params=OllamaLLMParams(
        temperature=0.2,
        max_output_tokens=512,
        think="medium",
        keep_alive="5m",
    ),
)
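To make the merge into Ollama options concrete, here is a rough sketch of the mapping. Only max_output_tokens to num_predict is stated on this page; the identity mappings are assumptions based on Ollama's documented option names, penalties are left out because their mapping is not described here, and build_ollama_options is a hypothetical name rather than the adapter's real function.

```python
from typing import Any

# Framework field name -> Ollama option name.
# Only num_predict is confirmed by this page; the rest are assumed identities.
_OPTION_NAMES = {
    "temperature": "temperature",
    "max_output_tokens": "num_predict",
    "top_p": "top_p",
    "stop": "stop",
    "seed": "seed",
}


def build_ollama_options(params: dict[str, Any]) -> dict[str, Any]:
    """Keep only known, non-None fields and rename them for Ollama options."""
    return {
        _OPTION_NAMES[name]: value
        for name, value in params.items()
        if name in _OPTION_NAMES and value is not None
    }
```

In this sketch think and keep_alive are deliberately not part of options, since the table lists them as separate fields with their own meaning.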
Structured output
With nucleusiq>=0.7.10, the core structured_output resolver recognizes BaseOllama so Agent(..., response_format=MyModel) uses the correct provider payload. Model and schema support still depend on your Ollama model and server version, so validate on your stack.
Tools + schema
If the agent has tools, structured format is dropped for safety (logged). Prefer a tools-only pass or a separate execute without tools when you need strict JSON.
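The drop-with-warning rule can be sketched as a small guard when assembling a chat request. This is a simplified illustration, not the adapter's code: build_chat_kwargs and its parameters are hypothetical, and the returned dict merely mirrors the model, messages, tools, and format fields of an Ollama chat request.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def build_chat_kwargs(
    model: str,
    messages: list[dict[str, Any]],
    tools: Optional[list[dict[str, Any]]] = None,
    response_schema: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Illustrative guard: tools win, and the structured format is dropped."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    if tools:
        kwargs["tools"] = tools
        if response_schema is not None:
            # Mirrors the documented behavior: log and omit format.
            logger.warning("response_format ignored because tools are present")
    elif response_schema is not None:
        kwargs["format"] = response_schema  # JSON schema for structured output
    return kwargs
```

When strict JSON is required, calling the sketch a second time without tools is the equivalent of the separate execute pass suggested above.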
Streaming
Use agent.execute_stream(...) like other providers; the adapter emits StreamEvent tokens (and THINKING when think is enabled).
Runnable examples (monorepo)
From src/providers/inference/ollama after uv sync:
uv run python examples/agents/00_ollama_smoke.py
uv run python examples/agents/01_ollama_direct.py
uv run python examples/agents/02_ollama_stream_live.py
uv run python examples/agents/03_ollama_capabilities_matrix.py
03_ollama_capabilities_matrix.py: chat, stream, structured output, and thinking × DIRECT / STANDARD / AUTONOMOUS (filter with --only).
Package README: src/providers/inference/ollama/README.md.
See also
- Ollama quickstart: Copy-paste gears
- Providers: Portability
- Models: Parameter tabs
- Installation: nucleusiq[http] optional extra (v0.7.10)
- Structured output: Framework patterns