Anthropic (Claude) provider

🟢 Stable — nucleusiq-anthropic 0.2.0 (Phase B feature-complete)

nucleusiq-anthropic 0.2.0 ships as Development Status :: 5 - Production/Stable (first stable line; semver applies from here). It connects Claude to NucleusIQ through the Messages API and the official anthropic Python SDK (AsyncAnthropic / Anthropic), with full Phase B feature parity.

  • Requires nucleusiq>=0.7.12 and anthropic>=0.40,<1.
  • 151 unit tests + 6 live integration tests, 95.91% coverage (gate ≥ 95%).
  • All Phase B features are covered by both unit tests and live API tests.

Use this guide as the canonical reference for capabilities, environment, parameters, and operational notes. Copy-paste workflows live in the Anthropic quickstart. Conceptual deep-dives on native server tools, prompt caching, and extended thinking live on their own pages.

What's in 0.2.0 (Phase B)

Capability | Status
Messages API (POST /v1/messages): BaseAnthropic.call / call_stream (StreamEvent) |
@tool / local function tools |
Streaming (tokens, THINKING events, tool_call_start / tool_call_end, COMPLETE with metadata) |
Structured output (JSON Schema → Messages output_config.format); tools + format gracefully fall back to function-style tools with a warning |
Sampling: temperature, max_output_tokens, top_p (auto-elided when temperature is set on incompatible SKUs) | ✅ via framework LLMParams
Errors / retries: SDK exceptions mapped to nucleusiq.llms.errors, Retry-After + capped backoff via shared retry_policy |
AnthropicTool native server tools: web_search(), web_fetch(), code_execution() with dated wire types + auto beta headers | New in 0.2.0
Prompt caching: cache_system=True / cache_tools=True emit cache_control: ephemeral blocks on the wire | New in 0.2.0
Extended thinking: thinking="low"|"medium"|"high"|"max" or full dict | New in 0.2.0
strict_tools=True + disable_parallel_tool_use=True | New in 0.2.0
Server-tool observability: AnthropicLLMResponse.server_tool_calls populated from server_tool_use + per-tool *_tool_result blocks; surfaced as ToolCallRecord(executed_by="provider") by the core agent loop | New in 0.2.0
LLMCallRecord enrichment: provider="anthropic", request_id, organization_id, stop_reason, cache_read_input_tokens, cache_creation_input_tokens | New in 0.2.0
Multimodal image messages (HTTP / data URLs) | ✅ via translation layer
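
The retry row above mentions Retry-After and capped backoff. The shared retry_policy itself is not shown in this guide, so the following is only a rough sketch of the idea: a server-supplied Retry-After value wins, otherwise the delay grows exponentially, and both paths are clamped to a ceiling. The function name retry_delay and the base/cap defaults are assumptions made for illustration, not the framework's values.

```python
from typing import Optional


def retry_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 0.5,
    cap: float = 30.0,
) -> float:
    """Return seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: a Retry-After value takes priority, otherwise
    exponential backoff is used. Both paths are clamped to `cap`.
    """
    if retry_after is not None:
        return min(max(retry_after, 0.0), cap)
    return min(base * (2 ** attempt), cap)


print(retry_delay(0))                   # 0.5
print(retry_delay(3))                   # 4.0
print(retry_delay(10))                  # 30.0 (capped)
print(retry_delay(1, retry_after=2.0))  # 2.0
```

Production retry loops usually add random jitter on top of the computed delay; it is left out here to keep the output deterministic.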

Not in 0.2.0 (deferred)

  • Anthropic Phase C — Memory tool / computer_use / bash — planned for nucleusiq-anthropic 0.3.x.
  • Bedrock / Vertex / Foundry backends — direct Anthropic API only for the stable line.

Prerequisites

  1. Anthropic Console API key (ANTHROPIC_API_KEY).
  2. A model id your organization can call. Phase B features (native tools, prompt caching, extended thinking) require Claude Sonnet 4 / Opus 4 / 3.7-Sonnet or newer.

Pick a Phase-B-capable model

The Phase B examples and live integration tests default to claude-sonnet-4-5-20250929 because it supports every Phase B feature (web_search, code_execution, cache_*, thinking).

If you see 404 model_not_found, discover the model ids available on your key with examples/agents/09_anthropic_list_models.py and override via ANTHROPIC_PHASE_B_MODEL=<id>.

Installation

pip install nucleusiq nucleusiq-anthropic

Pin the stable line for reproducible builds:

pip install "nucleusiq>=0.7.12" "nucleusiq-anthropic>=0.2.0,<0.3"

Environment

Variable | Purpose
ANTHROPIC_API_KEY | Required for live calls unless you pass api_key="..." to BaseAnthropic.
ANTHROPIC_MODEL | Optional default model id for examples.
ANTHROPIC_PHASE_B_MODEL | Optional model id used by the Phase B examples / integration tests (default claude-sonnet-4-5-20250929).

export ANTHROPIC_API_KEY=sk-ant-...
export ANTHROPIC_PHASE_B_MODEL=claude-sonnet-4-5-20250929   # or your accessible Phase B model
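
If you want to fail fast on a missing key before constructing the LLM, a small helper along these lines can collect the three variables in one place. This is a sketch only: resolve_anthropic_env is a hypothetical name, not part of nucleusiq-anthropic, and the fallback model id is simply the default quoted above.

```python
import os
from typing import Mapping

DEFAULT_MODEL = "claude-sonnet-4-5-20250929"


def resolve_anthropic_env(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: read the variables from the table above."""
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it or pass api_key=... to BaseAnthropic"
        )
    return {
        "api_key": api_key,
        "model": env.get("ANTHROPIC_MODEL", DEFAULT_MODEL),
        "phase_b_model": env.get("ANTHROPIC_PHASE_B_MODEL", DEFAULT_MODEL),
    }


# Demo with an explicit mapping so it runs without touching the real environment.
settings = resolve_anthropic_env({"ANTHROPIC_API_KEY": "sk-ant-example"})
print(settings["phase_b_model"])  # claude-sonnet-4-5-20250929
```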

Quick start (DIRECT)

Construct BaseAnthropic(..., async_mode=True), put sampling settings in AgentConfig via LLMParams, and call await agent.initialize() before execute(). The example below mirrors 01_anthropic_direct.py.

import asyncio
import os

from nucleusiq.agents import Agent
from nucleusiq.agents.config import AgentConfig, ExecutionMode
from nucleusiq.agents.task import Task
from nucleusiq.llms.llm_params import LLMParams
from nucleusiq.prompts.zero_shot import ZeroShotPrompt
from nucleusiq_anthropic import BaseAnthropic


async def main() -> None:
    model = os.getenv("ANTHROPIC_MODEL", "claude-sonnet-4-5-20250929")
    llm = BaseAnthropic(model_name=model, async_mode=True)

    agent = Agent(
        name="anthropic-direct",
        prompt=ZeroShotPrompt().configure(
            system="You are a concise assistant. Reply in one or two short sentences.",
        ),
        llm=llm,
        config=AgentConfig(
            execution_mode=ExecutionMode.DIRECT,
            llm_params=LLMParams(temperature=0.3, max_output_tokens=256),
        ),
    )

    await agent.initialize()
    result = await agent.execute(
        Task(id="anthropic-direct-1", objective="What is the capital of France?"),
    )
    print(result.output)


asyncio.run(main())

Phase B at a glance

Server tools run inside Anthropic's infrastructure — you don't execute them, you just declare them.

import asyncio

from nucleusiq_anthropic import AnthropicTool, BaseAnthropic

MODEL = "claude-sonnet-4-5-20250929"


async def main() -> None:
    llm = BaseAnthropic(model_name=MODEL, async_mode=True)

    # Declare the server tool; Anthropic runs it, not your process.
    result = await llm.call(
        model=MODEL,
        messages=[{"role": "user", "content": "Use code_execution to compute Fib(12)."}],
        tools=[AnthropicTool.code_execution()],
        max_output_tokens=512,
    )
    print(result.choices[0].message.content)

    # Server-executed calls are reported separately from the message text.
    for stc in result.server_tool_calls:
        print(stc.name, stc.id, stc.result)


asyncio.run(main())

→ Full reference on the Native server tools page.
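
To make the server_tool_calls list less of a black box: on the wire, a Messages response carries a server_tool_use block for each server-side call and a matching *_tool_result block that refers back to it through tool_use_id. The sketch below pairs the two over plain dicts. It is not the adapter's code; the function name and the sample ids and payloads are simplified and illustrative.

```python
from typing import Any


def pair_server_tool_blocks(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Match each server_tool_use block with its *_tool_result block by id."""
    results = {
        block["tool_use_id"]: block.get("content")
        for block in content
        if block.get("type", "").endswith("_tool_result") and "tool_use_id" in block
    }
    return [
        {
            "id": block["id"],
            "name": block["name"],
            "input": block.get("input"),
            "result": results.get(block["id"]),  # None if no result block arrived
        }
        for block in content
        if block.get("type") == "server_tool_use"
    ]


# Illustrative response content; ids and payload shapes are simplified.
content = [
    {"type": "text", "text": "Let me compute that."},
    {"type": "server_tool_use", "id": "srvtoolu_01", "name": "code_execution",
     "input": {"code": "print(144)"}},
    {"type": "code_execution_tool_result", "tool_use_id": "srvtoolu_01",
     "content": {"stdout": "144\n"}},
]

for call in pair_server_tool_blocks(content):
    print(call["name"], call["id"])  # code_execution srvtoolu_01
```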

Reuse expensive system prompts across calls — Anthropic serves cached prefixes at a fraction of the cost.

from nucleusiq_anthropic import AnthropicLLMParams, BaseAnthropic

llm = BaseAnthropic(
    model_name="claude-sonnet-4-5-20250929",
    async_mode=True,
    llm_params=AnthropicLLMParams(cache_system=True, cache_tools=True),
)

→ Full reference on the Prompt caching page.
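
For orientation, this is roughly what the two flags change in the request body: the system prompt becomes a text block carrying cache_control, and the last tool definition gets the same marker. The helper below is a hypothetical illustration over plain dicts, not the adapter's translation code, and the sample tool is made up.

```python
import copy
import json
from typing import Any


def apply_cache_control(
    system: str,
    tools: list[dict[str, Any]],
    cache_system: bool = False,
    cache_tools: bool = False,
) -> dict[str, Any]:
    """Hypothetical sketch of the request fragments the two flags affect."""
    system_block: dict[str, Any] = {"type": "text", "text": system}
    if cache_system:
        system_block["cache_control"] = {"type": "ephemeral"}

    tools_out = copy.deepcopy(tools)  # never mutate the caller's definitions
    if cache_tools and tools_out:
        # Marking only the last tool definition, as described for cache_tools.
        tools_out[-1]["cache_control"] = {"type": "ephemeral"}

    return {"system": [system_block], "tools": tools_out}


tools = [
    {"name": "lookup", "description": "Illustrative tool",
     "input_schema": {"type": "object", "properties": {}}},
]
payload = apply_cache_control("You are a concise assistant.", tools,
                              cache_system=True, cache_tools=True)
print(json.dumps(payload["system"][0]["cache_control"]))  # {"type": "ephemeral"}
```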

Give Claude a token budget to reason internally before responding.

from nucleusiq_anthropic import AnthropicLLMParams, BaseAnthropic

llm = BaseAnthropic(
    model_name="claude-sonnet-4-5-20250929",
    async_mode=True,
    llm_params=AnthropicLLMParams(thinking="medium"),  # 8000-token budget
)
# NOTE: max_output_tokens MUST exceed thinking.budget_tokens
# and temperature MUST be 1.0 when thinking is enabled.

→ Full reference on the Extended thinking page.

The framework emits ToolCallRecord(executed_by="provider") for every server-side tool call automatically — no configuration required.

result = await agent.execute(task)
for tc in result.tool_calls:
    print(tc.tool_name, tc.executed_by)
# → web_search       provider
# → code_execution   provider
# → my_local_fn      local

→ Full reference on the Observability guide.
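
When auditing a run it can help to split the records by who executed them. The sketch below uses a small stand-in dataclass with the two attributes printed above (tool_name and executed_by) so that it runs without the framework; the real ToolCallRecord has more fields, and group_by_executor is a hypothetical helper name.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ToolCallStub:
    """Stand-in for ToolCallRecord; only the two fields used here."""
    tool_name: str
    executed_by: str


def group_by_executor(tool_calls) -> dict:
    """Group tool names by the executed_by value of each record."""
    grouped = defaultdict(list)
    for tc in tool_calls:
        grouped[tc.executed_by].append(tc.tool_name)
    return dict(grouped)


calls = [
    ToolCallStub("web_search", "provider"),
    ToolCallStub("code_execution", "provider"),
    ToolCallStub("my_local_fn", "local"),
]
print(group_by_executor(calls))
# {'provider': ['web_search', 'code_execution'], 'local': ['my_local_fn']}
```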

Provider-specific parameters

Pass Claude-only knobs on BaseAnthropic via AnthropicLLMParams (not on AgentConfig):

from nucleusiq_anthropic import AnthropicLLMParams, BaseAnthropic

llm = BaseAnthropic(
    model_name="claude-sonnet-4-5-20250929",
    async_mode=True,
    llm_params=AnthropicLLMParams(
        top_k=40,
        anthropic_beta="user-beta-flag-if-needed",
        extra_headers={"X-Custom": "value"},

        # Phase B
        thinking="medium",                # "low"|"medium"|"high"|"max" or dict
        cache_system=True,                # cache_control on system prompt
        cache_tools=True,                 # cache_control on last tool def
        strict_tools=True,                # strict JSON schema on custom tools
        disable_parallel_tool_use=True,   # one tool call per turn
    ),
)

Keep temperature / max tokens on AgentConfig(llm_params=LLMParams(...)).

Extended thinking constraints

When thinking is enabled Anthropic enforces:

  • temperature must be 1.0 (not 0.0).
  • max_output_tokens must be strictly greater than thinking.budget_tokens (budgets: low=2000, medium=8000, high=32000, max=64000).

Violating the second constraint returns a 400 invalid_request_error whose message says that max_tokens must be greater than thinking.budget_tokens.
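
Both constraints can be checked locally before a request is sent. The sketch below uses the effort budgets listed above; check_thinking_params is a hypothetical helper, and the dict form is assumed to carry a budget_tokens key as in Anthropic's thinking parameter.

```python
THINKING_BUDGETS = {"low": 2000, "medium": 8000, "high": 32000, "max": 64000}


def check_thinking_params(thinking, max_output_tokens: int, temperature: float) -> list:
    """Return a list of problems; an empty list means the combination looks valid."""
    if isinstance(thinking, dict):
        budget = thinking["budget_tokens"]  # assumed key of the full dict form
    else:
        budget = THINKING_BUDGETS[thinking]

    problems = []
    if temperature != 1.0:
        problems.append("temperature must be 1.0 when thinking is enabled")
    if max_output_tokens <= budget:
        problems.append(
            f"max_output_tokens ({max_output_tokens}) must exceed the thinking budget ({budget})"
        )
    return problems


print(check_thinking_params("medium", 8192, 1.0))       # []
print(len(check_thinking_params("medium", 4096, 0.3)))  # 2
```

Note that the comparison is strict: a max_output_tokens value equal to the budget is still rejected.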

Structured output

When your Claude model supports native structured outputs, set response_format= on Agent (Pydantic model, dataclass, TypedDict, or schema dict). The adapter maps this to Messages output_config.format and parses JSON into your type.

Tools + structured output

If the agent also has tools, structured output is dropped for that call path with a warning. Follow the Structured output guide and test without tools first.

Streaming (call_stream) does not apply response_format; you'll see a warning if it's set.
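
As an illustration of the kind of type you might pass as response_format and the result to expect back, the sketch below parses a JSON object into a dataclass using only the standard library. The adapter's own parsing is not shown in this guide, so CityAnswer and parse_structured are hypothetical stand-ins rather than framework APIs.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class CityAnswer:
    """Example response type; field names are illustrative."""
    city: str
    country: str


def parse_structured(raw_json: str, cls):
    """Parse a JSON object into a dataclass, ignoring unknown keys."""
    data = json.loads(raw_json)
    allowed = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in allowed})


answer = parse_structured('{"city": "Paris", "country": "France"}', CityAnswer)
print(answer)
# CityAnswer(city='Paris', country='France')
```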

Public API

from nucleusiq_anthropic import (
    AnthropicLLMParams,         # extended Phase B knobs (thinking, cache_*, strict_tools)
    AnthropicTool,              # factory for native server tools
    BaseAnthropic,              # the LLM wrapper
    NATIVE_TOOL_TYPES,          # frozenset {"web_search","web_fetch","code_execution"}
    NATIVE_TOOL_WIRE_TYPES,     # dated wire identifiers
    NATIVE_TOOL_BETA_HEADERS,   # required anthropic-beta tokens
    ServerToolCall,             # Pydantic model for server-executed tools
    ThinkingEffort,             # Literal["low","medium","high","max"]
    to_anthropic_tool_definition,
)

Runnable examples

Clone NucleusIQ and run from src/providers/llms/anthropic:

uv sync --group full   # or pip install -e . from that directory + core

# Phase A (always-works baseline)
uv run python examples/agents/01_anthropic_direct.py
uv run python examples/agents/03_anthropic_standard_tools.py
uv run python examples/agents/05_anthropic_stream.py
uv run python examples/agents/09_anthropic_list_models.py

# Phase B (require ANTHROPIC_PHASE_B_MODEL = Sonnet 4.5 / Opus 4 etc.)
uv run python examples/agents/10_anthropic_native_tools.py
uv run python examples/agents/11_anthropic_prompt_caching.py
uv run python examples/agents/12_anthropic_extended_thinking.py

Live integration tests

cd src/providers/llms/anthropic
uv run pytest tests/integration -m integration -q
# 6 tests: web_search, code_execution, prompt caching,
# extended thinking (low + medium), disable_parallel_tool_use

Live tests skip cleanly if your ANTHROPIC_API_KEY lacks access to the configured Phase B model.

See also