LLM Client

Thin LLM client surface with four provider backends and a shared message shape.

Auto-generated docs

When trails is installed, run `ENABLE_MKDOCSTRINGS=true ./scripts/docs-build` to build the full reference extracted from docstrings.

Factory methods

| Symbol | Signature | Description |
| --- | --- | --- |
| `LLMClient.anthropic` | `.anthropic(*, model: str = "claude-sonnet-4-6", api_key: str \| None = None, base_url: str \| None = None, cache: bool = False, cache_ttl: str \| None = None, retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an Anthropic-backed client |
| `LLMClient.openai` | `.openai(*, model: str = "gpt-4.1-mini", api_key: str \| None = None, base_url: str \| None = None, cache: bool = False, retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an OpenAI-backed client |
| `LLMClient.ollama` | `.ollama(*, model: str = "qwen3:8b", base_url: str = "http://localhost:11434", retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an Ollama-backed client (stdlib HTTP, no dependencies) |
| `LLMClient.mock` | `.mock(*, model: str = "mock:canned", response: str \| Callable = "canned reply", usage: LLMUsage \| None = None, cost_usd: float = 0.0, stop_reason: str \| None = "end_turn", fail_first: int = 0) -> LLMClient` | Deterministic mock client for tests |
| `LLMClient.from_config` | `.from_config(cfg: LLMConfig \| TrailsConfig) -> LLMClient` | Build a client from a `trails.toml` config section |
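
The four provider factories take keyword-only arguments, so a provider name read from configuration can be dispatched to the matching factory. The sketch below shows one way this might look. `build_client` and `FakeClient` are hypothetical names introduced for illustration and are not part of trails; `FakeClient` only stands in for `LLMClient` so the snippet runs without trails installed.

```python
from typing import Any

# Provider factory names as listed in the table above.
PROVIDERS = ("anthropic", "openai", "ollama", "mock")


def build_client(client_cls: type, provider: str, **kwargs: Any) -> Any:
    """Call the factory classmethod named `provider` on `client_cls`.

    Hypothetical helper, not part of trails.
    """
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}; expected one of {PROVIDERS}")
    factory = getattr(client_cls, provider)
    return factory(**kwargs)


class FakeClient:
    """Stand-in for LLMClient so the example runs without trails installed."""

    def __init__(self, provider: str, model: str) -> None:
        self.provider = provider
        self.model = model

    @classmethod
    def ollama(cls, *, model: str = "qwen3:8b", **_: Any) -> "FakeClient":
        # Default model copied from the documented .ollama signature.
        return cls("ollama", model)

    @classmethod
    def mock(cls, *, model: str = "mock:canned", **_: Any) -> "FakeClient":
        # Default model copied from the documented .mock signature.
        return cls("mock", model)


client = build_client(FakeClient, "ollama")
print(client.provider, client.model)
```

With trails installed, passing `LLMClient` in place of `FakeClient` should behave the same way, assuming the factories are exposed as class-level callables as the table suggests.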

Completion

| Symbol | Signature | Description |
| --- | --- | --- |
| `.complete` | `.complete(messages, *, max_tokens: int = 1024, temperature: float = 0.0, stop: list[str] \| None = None, ctx: Context \| None = None, task_budget: TaskBudget \| None = None, effort: str \| None = None, thinking: ThinkingConfig \| None = None, response_format: dict \| str \| None = None) -> LLMResponse` | Run a single completion with optional cost tracking and PROV emission |
| `.complete_structured` | `.complete_structured(messages, *, shape_or_schema: str \| dict \| type, max_tokens: int = 1024, temperature: float = 0.0, ctx: Context \| None = None) -> LLMResponse` | Complete with structured output constrained to a `@shape`, `@node_type`, or JSON Schema |
| `.batch` | `.batch(requests: list[BatchRequest], *, ctx: Context \| None = None) -> list[BatchResult]` | Process multiple completions as a batch |
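
`.batch` returns a list of `BatchResult` items, each of which has an optional `response` and an optional `error`. A minimal sketch of splitting results by outcome follows. It assumes a failed item sets `error` and a successful one sets `response`, which the signature suggests but does not state. The dataclass is a simplified stand-in that mirrors only the documented fields (the real `response` is an `LLMResponse`, replaced by a plain string here), and `partition_results` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BatchResult:
    """Stand-in with the documented fields; response is simplified to str."""
    custom_id: str
    response: Optional[str] = None
    error: Optional[str] = None


def partition_results(results):
    """Split results into {custom_id: response} and {custom_id: error}."""
    ok, failed = {}, {}
    for r in results:
        if r.error is not None:
            failed[r.custom_id] = r.error
        elif r.response is not None:
            ok[r.custom_id] = r.response
        else:
            # Neither field set: treat as a failure rather than drop the item.
            failed[r.custom_id] = "no response and no error"
    return ok, failed


results = [
    BatchResult("q1", response="Paris"),
    BatchResult("q2", error="timeout"),
]
ok, failed = partition_results(results)
print(ok)      # {'q1': 'Paris'}
print(failed)  # {'q2': 'timeout'}
```

Keying on `custom_id` means the caller does not need to rely on results coming back in request order.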

Data types

| Symbol | Signature | Description |
| --- | --- | --- |
| `Message` | `Message(role: str, content: str, cache: bool = False)` | One chat message (`"system"`, `"user"`, or `"assistant"`) |
| `LLMResponse` | `LLMResponse(text, usage, model, stop_reason, cost_usd, cache_hit, ...)` | Normalized response across providers |
| `LLMUsage` | `LLMUsage(prompt_tokens, completion_tokens, total_tokens, cache_creation_tokens, cache_read_tokens)` | Token usage breakdown |
| `BatchRequest` | `BatchRequest(custom_id: str, messages: list[Message], max_tokens: int = 1024, temperature: float = 0.0)` | One item in a batch |
| `BatchResult` | `BatchResult(custom_id: str, response: LLMResponse \| None = None, error: str \| None = None)` | Result for one batch item |
| `RetryPolicy` | `RetryPolicy(max_retries: int = 3, base_delay: float = 0.5, max_delay: float = 8.0, jitter: bool = True)` | Exponential backoff configuration |
| `TaskBudget` | `TaskBudget(total: int, remaining: int \| None = None)` | Advisory token budget for agentic loops (Anthropic only) |
| `ThinkingConfig` | `ThinkingConfig.adaptive() \| .enabled(budget_tokens) \| .disabled()` | Extended / adaptive thinking configuration (Anthropic only) |
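
`RetryPolicy` is described only as an exponential backoff configuration, so the exact delay formula is not given here. The sketch below shows one common interpretation: the delay doubles on each attempt, is capped at `max_delay`, and, when `jitter` is set, is drawn uniformly from zero up to that delay. The doubling factor, the jitter scheme, and the `backoff_delay` helper are all assumptions, and the dataclass is a stand-in that mirrors the documented fields and defaults.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Stand-in mirroring the documented fields and defaults."""
    max_retries: int = 3
    base_delay: float = 0.5
    max_delay: float = 8.0
    jitter: bool = True


def backoff_delay(policy: RetryPolicy, attempt: int,
                  rng: Optional[random.Random] = None) -> float:
    """Delay in seconds before retry number `attempt` (0-based).

    Assumed scheme: base_delay * 2**attempt, capped at max_delay, with
    uniform jitter in [0, delay] when policy.jitter is True.
    """
    delay = min(policy.max_delay, policy.base_delay * (2 ** attempt))
    if policy.jitter:
        if rng is None:
            rng = random.Random()
        delay = rng.uniform(0.0, delay)
    return delay


# With jitter off, the schedule is deterministic.
policy = RetryPolicy(jitter=False)
schedule = [backoff_delay(policy, n) for n in range(policy.max_retries)]
print(schedule)  # [0.5, 1.0, 2.0]
```

Under these assumptions the default policy would wait at most 0.5 s, 1 s, and 2 s across its three retries, and the 8 s cap only comes into play once `max_retries` is raised to five or more.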