Observability¶
Trails exposes a zero-dependency event hook plus an in-memory tracer
and metrics store. The hook fires on every @capability invocation, on
every LLMClient.complete() call, and on every ctx.kg write or query
— so an application can wire OpenTelemetry, Prometheus, or a custom
sink without Trails taking a hard dependency on any of them.
Commits of record: 622d533 (capability + LLM events) and bbbac1b
(KG events). Cost-side mirror is ADR-0012; runtime-side integration
is ADR-0018.
Quickstart¶
from trails import capability, invoke
from trails.observability import register_observer

def log(kind, event):
    print(kind, event)

register_observer(log)

@capability
def greet(name: str) -> dict:
    return {"msg": f"hello {name}"}

invoke("greet", {"name": "ada"})
# capability_started {'capability_id': 'greet', 'args_keys': ['name'], ...}
# capability_completed {'capability_id': 'greet', 'outcome': 'success', ...}
Observers run in registration order, receive the same dict, and never affect the invoke path: exceptions are caught and logged.
Event kinds¶
Six kinds are emitted from the runtime today. Every event is a plain
dict — pick the fields you need.
capability_started¶
Fired before the handler runs (and before any @policy check).
Fields: capability_id, args_keys (sorted list of top-level arg
names — values are NOT leaked), trace_id, principal, started_at
(time.monotonic() snapshot).
capability_completed¶
Fired after provenance is attached, on success only.
Fields: capability_id, trace_id, duration_ms, outcome="success".
capability_failed¶
Fired whenever the handler raises, whenever @policy denies, or when
the return value is not JSON-serializable.
Fields: capability_id, trace_id, duration_ms, outcome="failed",
error_kind (the exception class name, e.g. ValueError,
PermissionError), message (str(exc)).
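As an illustration of consuming the three capability events, the sketch below tallies outcomes and failure classes per capability. It relies only on the fields listed above. CapabilityStats is a hypothetical helper, not part of Trails, and the sample events are hand-built for the example rather than captured from a real run.

```python
from collections import Counter, defaultdict


class CapabilityStats:
    """Hypothetical observer: counts outcomes and error kinds per capability."""

    def __init__(self):
        self.outcomes = defaultdict(Counter)     # capability_id -> outcome counts
        self.error_kinds = defaultdict(Counter)  # capability_id -> error_kind counts

    def __call__(self, kind, event):
        # capability_started carries no outcome, so it is skipped here.
        if kind not in ("capability_completed", "capability_failed"):
            return
        cap = event["capability_id"]
        self.outcomes[cap][event["outcome"]] += 1
        if kind == "capability_failed":
            self.error_kinds[cap][event["error_kind"]] += 1


stats = CapabilityStats()

# Hand-built sample events shaped like the field lists above.
samples = [
    ("capability_started", {"capability_id": "greet", "trace_id": "t1"}),
    ("capability_completed", {"capability_id": "greet", "trace_id": "t1",
                              "duration_ms": 1.2, "outcome": "success"}),
    ("capability_failed", {"capability_id": "greet", "trace_id": "t2",
                           "duration_ms": 0.4, "outcome": "failed",
                           "error_kind": "PermissionError",
                           "message": "denied"}),
]
for kind, event in samples:
    stats(kind, event)
```

In an application the instance would be attached with register_observer(stats); because it is a plain callable, it can also be exercised directly in tests as shown.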
llm_call¶
Fired from LLMClient.complete() after retry resolution, independent
of whether a Context was passed. Best-effort — an emit failure only
logs a warning.
Fields: model, provider ("anthropic" | "ollama" | "mock"),
tokens (total prompt + completion), cost_usd, latency_ms.
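One possible use of llm_call is a running usage total per provider and model. The sketch below assumes only the documented fields; track_llm and llm_totals are hypothetical names, and the model name in the sample events is made up.

```python
from collections import defaultdict

# Running totals per (provider, model); plain dicts keep the observer cheap.
llm_totals = defaultdict(lambda: {"calls": 0, "tokens": 0, "cost_usd": 0.0})


def track_llm(kind, event):
    """Hypothetical observer: accumulates llm_call usage per provider and model."""
    if kind != "llm_call":
        return
    bucket = llm_totals[(event["provider"], event["model"])]
    bucket["calls"] += 1
    bucket["tokens"] += event["tokens"]
    bucket["cost_usd"] += event["cost_usd"]


# Hand-built events using the documented fields; the model name is illustrative.
track_llm("llm_call", {"model": "example-model", "provider": "mock",
                       "tokens": 120, "cost_usd": 0.0, "latency_ms": 3.0})
track_llm("llm_call", {"model": "example-model", "provider": "mock",
                       "tokens": 80, "cost_usd": 0.0, "latency_ms": 2.5})
# Events of other kinds are ignored.
track_llm("kg_query", {"row_count": 1})
```

The totals are not protected by a lock, so an application that emits from several threads would likely want to add one.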
kg_write¶
Fired from ctx.kg.add, ctx.kg.save, and ctx.kg.update.
Common fields: trace_id, principal, duration_ms. The
add / save paths add op ("add" or "save") and model (the
@node_type class name); save additionally carries dirty_fields
(the list of attributes that changed since the last flush). The raw
SPARQL escape hatch (ctx.kg.update) instead carries
sparql_kind="update" — no op, no model.
kg_query¶
Fired from ctx.kg.query and ctx.kg.match.
Common fields: trace_id, principal, row_count, duration_ms.
The SPARQL path adds sparql_kind — one of "select", "ask",
"construct" (detected by first non-prefix token). The match-variant
adds op="match", labels (list of strings), and types (class
names stringified — class objects are never leaked to observers).
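Because kg_write and kg_query events come in two shapes each, an observer usually has to branch on which keys are present. The sketch below shows one way to do that, assuming the field layout described above (op on the add / save / match paths, sparql_kind on the raw SPARQL paths). describe_kg_event is a hypothetical helper and "Person" is a made-up node type.

```python
def describe_kg_event(kind, event):
    """Return a short label for a kg_write / kg_query event, else None."""
    if kind not in ("kg_write", "kg_query"):
        return None
    if "op" in event:
        # add / save carry `model`; match carries `types` instead.
        target = event.get("model") or ",".join(event.get("types", []))
        return f"{kind}:{event['op']}:{target}"
    # Raw SPARQL paths carry sparql_kind and neither op nor model.
    return f"{kind}:sparql:{event['sparql_kind']}"


# Hand-built sample events; values are illustrative only.
save_label = describe_kg_event(
    "kg_write",
    {"op": "save", "model": "Person", "dirty_fields": ["email"], "duration_ms": 0.8},
)
update_label = describe_kg_event(
    "kg_write", {"sparql_kind": "update", "duration_ms": 1.1}
)
select_label = describe_kg_event(
    "kg_query", {"sparql_kind": "select", "row_count": 3, "duration_ms": 0.5}
)
match_label = describe_kg_event(
    "kg_query",
    {"op": "match", "labels": ["knows"], "types": ["Person"], "row_count": 1},
)
```

Labels like these could serve as metric names or span names in whichever sink the application wires up.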
API reference¶
from trails.observability import (
    Observer, register_observer, unregister_observer,
    clear_observers, emit,
)
- register_observer(callback): appends callback to the global observer list. Registering the same callable twice fires it twice; each registration must be removed independently.
- unregister_observer(callback): removes the first registration. Silent no-op when not registered (safe in finally blocks).
- clear_observers(): strips every registration. Intended for tests.
- emit(kind, **fields): fires an event to every observer. Takes a snapshot under lock, then releases the lock before invoking callbacks, so a slow observer does not block concurrent emitters. With zero observers it returns immediately.
Thread-safety: registration, unregistration, and the snapshot inside
emit are all guarded by a module-level threading.Lock. Observer
callbacks run outside that lock, so an observer may itself emit
events or (un)register observers without self-deadlocking — but the
changes apply to the next emit, not the in-flight one.
OpenTelemetry bridge¶
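The event hook is designed to map cleanly onto OTel spans. The minimal bridge below opens a span on capability_started and ends it on capability_completed or capability_failed, keyed by trace_id. llm_call events are not handled here; recording them as events on the enclosing span would need an additional branch: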
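from opentelemetry import trace
from opentelemetry.trace import Status, StatusCode
from trails.observability import register_observer

otel = trace.get_tracer("my-app")
spans: dict[str, trace.Span] = {}  # open spans, keyed by Trails trace_id

def to_otel(kind, event):
    if kind == "capability_started":
        spans[event["trace_id"]] = otel.start_span(event["capability_id"])
    elif kind in ("capability_completed", "capability_failed"):
        s = spans.pop(event["trace_id"], None)
        if s is not None:
            if kind == "capability_failed":
                # Mark the span as errored before closing it.
                s.set_status(Status(StatusCode.ERROR, event["message"]))
            s.end()

register_observer(to_otel)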
For direct in-process tracing without OTel, the module also ships a
TrailsTracer and a module-level tracer singleton. It stores spans
in memory, logs each one as a JSON line on the trails.trace logger,
and supports nesting:
from trails.observability import tracer

with tracer.span("my-op", attributes={"version": 2}) as span:
    ...  # nested tracer.span() calls inherit span.trace_id
tracer.list_spans() / tracer.get_spans_by_trace(trace_id) are the
query API used by the trails CLI. TrailsTracer.__init__ accepts
an otlp_endpoint argument that is reserved for future OTLP export;
until then it is a no-op and spans go to the logger.
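Since spans are logged as JSON lines on the trails.trace logger, a standard logging handler is one way to capture them without touching the tracer. The sketch below is an assumption-laden example: SpanLineCollector is a hypothetical class, the log level and the JSON keys are guesses, and the line logged here is a hand-written stand-in for tracer output.

```python
import json
import logging


class SpanLineCollector(logging.Handler):
    """Hypothetical handler: parses JSON log lines into dicts."""

    def __init__(self):
        super().__init__()
        self.spans = []

    def emit(self, record):
        try:
            self.spans.append(json.loads(record.getMessage()))
        except (ValueError, TypeError):
            pass  # ignore lines that are not valid JSON


collector = SpanLineCollector()
trace_log = logging.getLogger("trails.trace")  # logger name from the text above
trace_log.setLevel(logging.INFO)               # assumed level; adjust as needed
trace_log.addHandler(collector)

# Stand-in for a line the tracer might log; the keys are illustrative.
trace_log.info(json.dumps({"name": "my-op", "trace_id": "abc", "status": "ok"}))
trace_log.info("not json")  # skipped by the collector
```

The same handler could forward parsed spans to a file or a remote sink instead of keeping them in a list.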
Cost accounting bridge¶
llm_call events and the CostTracker surface record the same data
twice, on purpose:
- llm_call events are the thin, user-wirable side. They flow through any observer (OTel, Prometheus, stdout) and fire even when no Context is threaded through.
- CostTracker (ADR-0012) is the framework-internal budget primitive: one envelope per call, nested correctly via call_id / parent_call_id (commit d2cd31a), enforceable at capability / principal / tenant scope.
Use observers for metrics export, dashboards, and custom alerting;
use CostTracker when you need to enforce budget before the next
call runs.
Best practices¶
- Observers are best-effort. An exception in your callback is caught and logged on trails.observability, never re-raised. Treat observers as fire-and-forget.
- Do not block. The emit path runs synchronously on the invoke thread. If your sink is remote (Datadog, Honeycomb, OTLP HTTP), queue the event and flush on a background thread. Synchronous observers must stay microseconds-cheap.
- Observers compose. Register as many as you need; every one receives every event. Call clear_observers() between tests to avoid leakage between cases. See trails.testing.capture_events for a context-manager helper that registers + unregisters an in-memory collector.
- Zero observers = zero cost. emit exits early when the list is empty; decorator-free handlers pay no observability overhead.
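The "do not block" advice can be followed with a bounded queue and a background worker. The sketch below is one possible shape, using only the standard library: QueuedObserver is a hypothetical class, the sink is a stand-in for a remote exporter, and dropping events when the queue is full is a design choice of this example rather than Trails behaviour.

```python
import queue
import threading


class QueuedObserver:
    """Hypothetical observer: enqueues events; a worker thread drains them."""

    _STOP = object()

    def __init__(self, sink, maxsize=1000):
        self._sink = sink                      # slow callable, e.g. an HTTP exporter
        self._queue = queue.Queue(maxsize=maxsize)
        self.dropped = 0                       # approximate under heavy concurrency
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def __call__(self, kind, event):
        try:
            # Copy the dict so later mutation elsewhere cannot leak in.
            self._queue.put_nowait((kind, dict(event)))
        except queue.Full:
            self.dropped += 1                  # shed load instead of blocking invoke

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is self._STOP:
                return
            try:
                self._sink(*item)
            except Exception:
                pass                           # a failing sink must not kill the worker

    def close(self):
        self._queue.put(self._STOP)            # queued behind any pending events
        self._worker.join()


delivered = []
observer = QueuedObserver(lambda kind, event: delivered.append((kind, event)))
observer("llm_call", {"model": "example-model", "latency_ms": 12.0})
observer("kg_query", {"row_count": 3})
observer.close()  # drains pending events, then stops the worker
```

The instance would be passed to register_observer like any other callback. The call on the invoke thread is then a single non-blocking queue put, and close() gives shutdown code a point at which everything queued has been handed to the sink.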
Reference table¶
Every public symbol exported by trails.observability:
| Symbol | Kind | Purpose |
|---|---|---|
| EventKind | Literal type alias | Enumerates the six emitted kinds |
| Observer | Callable[[str, dict], None] alias | Callback signature |
| Span | dataclass | Single trace span (trace_id, span_id, name, attributes, start_time, end_time, status, parent_id) |
| TrailsTracer | class | In-memory tracer; start_span, end_span, span() ctx manager, list_spans, get_spans_by_trace, clear |
| TrailsMetrics | class | In-memory counters + latency histograms; record_invocation, record_error, get_summary, clear |
| tracer | TrailsTracer singleton | Process-wide tracer used by the CLI |
| metrics | TrailsMetrics singleton | Process-wide metrics registry |
| register_observer(cb) | function | Append an event observer |
| unregister_observer(cb) | function | Remove one registration (silent no-op if absent) |
| clear_observers() | function | Drop every registration (tests) |
| emit(kind, **fields) | function | Fire an event to every observer |
See also: LLM Client & Session for how llm_call is wired,
and Testing for capture_events — the canonical
in-test observer.