LLM Client
Thin LLM client surface with four provider backends and a shared message shape.
Auto-generated docs
When trails is installed, run `ENABLE_MKDOCSTRINGS=true ./scripts/docs-build` for the full docstring-extracted reference.
Factory methods
| Symbol | Signature | Description |
|---|---|---|
| `LLMClient.anthropic` | `.anthropic(*, model: str = "claude-sonnet-4-6", api_key: str \| None = None, base_url: str \| None = None, cache: bool = False, cache_ttl: str \| None = None, retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an Anthropic-backed client |
| `LLMClient.openai` | `.openai(*, model: str = "gpt-4.1-mini", api_key: str \| None = None, base_url: str \| None = None, cache: bool = False, retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an OpenAI-backed client |
| `LLMClient.ollama` | `.ollama(*, model: str = "qwen3:8b", base_url: str = "http://localhost:11434", retry: RetryPolicy \| None = None, timeout: float = 30.0) -> LLMClient` | Build an Ollama-backed client (stdlib HTTP, no dependencies) |
| `LLMClient.mock` | `.mock(*, model: str = "mock:canned", response: str \| Callable = "canned reply", usage: LLMUsage \| None = None, cost_usd: float = 0.0, stop_reason: str \| None = "end_turn", fail_first: int = 0) -> LLMClient` | Deterministic mock client for tests |
| `LLMClient.from_config` | `.from_config(cfg: LLMConfig \| TrailsConfig) -> LLMClient` | Build a client from a `trails.toml` config section |
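The `response` and `fail_first` parameters of `.mock` suggest a test double that fails a fixed number of calls before answering. The sketch below is a self-contained stand-in, not the trails implementation. The names `FakeMockClient`, `TransientError`, and `complete_with_retries` are hypothetical, and two behaviours are assumptions: that the first `fail_first` calls raise, and that a callable `response` receives the message list.

```python
from dataclasses import dataclass, field
from typing import Callable, Union


class TransientError(RuntimeError):
    """Hypothetical error raised by the stand-in while it is 'failing'."""


@dataclass
class FakeMockClient:
    """Stand-in for a deterministic mock client; NOT the trails class."""

    response: Union[str, Callable[[list], str]] = "canned reply"
    fail_first: int = 0
    calls: int = field(default=0, init=False)

    def complete(self, messages: list) -> str:
        self.calls += 1
        # Assumed semantics: the first `fail_first` calls raise, later calls succeed.
        if self.calls <= self.fail_first:
            raise TransientError(f"simulated failure {self.calls}")
        # Assumed semantics: a callable response is computed from the messages.
        if callable(self.response):
            return self.response(messages)
        return self.response


def complete_with_retries(client: FakeMockClient, messages: list, max_retries: int = 3) -> str:
    """Make one attempt plus up to `max_retries` retries; re-raise the last failure."""
    for attempt in range(max_retries + 1):
        try:
            return client.complete(messages)
        except TransientError:
            if attempt == max_retries:
                raise
    raise AssertionError("unreachable for max_retries >= 0")
```

With `fail_first=2` and `max_retries=3`, the stand-in succeeds on its third call, so a test can assert on both the returned text and the number of attempts without touching the network.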
Completion
| Symbol | Signature | Description |
|---|---|---|
| `.complete` | `.complete(messages, *, max_tokens: int = 1024, temperature: float = 0.0, stop: list[str] \| None = None, ctx: Context \| None = None, task_budget: TaskBudget \| None = None, effort: str \| None = None, thinking: ThinkingConfig \| None = None, response_format: dict \| str \| None = None) -> LLMResponse` | Run a single completion with optional cost tracking and PROV emission |
| `.complete_structured` | `.complete_structured(messages, *, shape_or_schema: str \| dict \| type, max_tokens: int = 1024, temperature: float = 0.0, ctx: Context \| None = None) -> LLMResponse` | Complete with structured output constrained to a `@shape`, `@node_type`, or JSON Schema |
| `.batch` | `.batch(requests: list[BatchRequest], *, ctx: Context \| None = None) -> list[BatchResult]` | Process multiple completions as a batch |
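The `.batch` signature pairs each `BatchRequest` with a `BatchResult` that carries either a `response` or an `error`. The sketch below shows one way that shape can be used, with stand-in dataclasses that mirror the fields listed under Data types. It is not the trails implementation: `run_batch` and `fake_complete` are hypothetical helpers, `response` is simplified to a plain string instead of an `LLMResponse`, and the idea that per-item failures are captured instead of raised is an assumption inferred from the `error` field.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    role: str
    content: str
    cache: bool = False


@dataclass
class BatchRequest:
    custom_id: str
    messages: list
    max_tokens: int = 1024
    temperature: float = 0.0


@dataclass
class BatchResult:
    custom_id: str
    response: Optional[str] = None  # simplified: the documented field is an LLMResponse
    error: Optional[str] = None


def run_batch(requests: list, complete: Callable[[list], str]) -> list:
    """Run each request in order; record a failure on its own item (assumed behaviour)."""
    results = []
    for req in requests:
        try:
            results.append(BatchResult(req.custom_id, response=complete(req.messages)))
        except Exception as exc:
            results.append(BatchResult(req.custom_id, error=str(exc)))
    return results


def fake_complete(messages: list) -> str:
    """Hypothetical completion function: echoes the last message, rejects empty prompts."""
    text = messages[-1].content
    if not text:
        raise ValueError("empty prompt")
    return f"echo: {text}"
```

Matching results to requests by `custom_id` rather than by list position keeps callers correct even if a backend returns items out of order.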
Data types
| Symbol | Signature | Description |
|---|---|---|
| `Message` | `Message(role: str, content: str, cache: bool = False)` | One chat message (`"system"`, `"user"`, or `"assistant"`) |
| `LLMResponse` | `LLMResponse(text, usage, model, stop_reason, cost_usd, cache_hit, ...)` | Normalized response across providers |
| `LLMUsage` | `LLMUsage(prompt_tokens, completion_tokens, total_tokens, cache_creation_tokens, cache_read_tokens)` | Token usage breakdown |
| `BatchRequest` | `BatchRequest(custom_id: str, messages: list[Message], max_tokens: int = 1024, temperature: float = 0.0)` | One item in a batch |
| `BatchResult` | `BatchResult(custom_id: str, response: LLMResponse \| None = None, error: str \| None = None)` | Result for one batch item |
| `RetryPolicy` | `RetryPolicy(max_retries: int = 3, base_delay: float = 0.5, max_delay: float = 8.0, jitter: bool = True)` | Exponential backoff configuration |
| `TaskBudget` | `TaskBudget(total: int, remaining: int \| None = None)` | Advisory token budget for agentic loops (Anthropic only) |
| `ThinkingConfig` | `ThinkingConfig.adaptive() \| .enabled(budget_tokens) \| .disabled()` | Extended / adaptive thinking configuration (Anthropic only) |
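To make the `RetryPolicy` fields concrete, the sketch below computes a delay schedule from them. It uses a stand-in dataclass with the same field names and defaults as the table. The doubling formula, the cap at `max_delay`, and the "full jitter" strategy are common choices assumed here for illustration, and `backoff_delays` is a hypothetical helper. The formula trails actually uses is not stated on this page.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Stand-in mirroring the documented fields; NOT the trails class."""

    max_retries: int = 3
    base_delay: float = 0.5
    max_delay: float = 8.0
    jitter: bool = True


def backoff_delays(policy: RetryPolicy, rng: Optional[random.Random] = None) -> list:
    """Return the wait (in seconds) before each retry under the assumed scheme."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(policy.max_retries):
        # Assumed formula: double the base delay per attempt, capped at max_delay.
        delay = min(policy.max_delay, policy.base_delay * (2 ** attempt))
        if policy.jitter:
            # Assumed "full jitter": pick uniformly between 0 and the capped delay.
            delay = rng.uniform(0.0, delay)
        delays.append(delay)
    return delays
```

Under these assumptions, the default policy with `jitter=False` waits 0.5, 1.0, and 2.0 seconds before its three retries. Raising `max_retries` to 6 extends the schedule to 4.0, 8.0, and 8.0 seconds, because the sixth delay would be 16.0 and is capped at `max_delay`.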