ADR-0071: End-to-End OTel Trace Context Propagation

  • Status: Accepted
  • Date: 2026-05-02
  • Supersedes:
  • Superseded by:
  • Related: ADR-0009 (Provenance Always On), ADR-0012 (Cost Primitive)

Context

Trails has an observability module (trails.observability) that fires events at individual points — capability start/complete/fail and LLM calls — and an OTLPExporter that converts those events into OpenTelemetry spans. However, there is no trace context propagation:

  1. No parent-child span relationships. A capability that calls an LLM and then writes to the KG produces three independent spans with no causal link.
  2. No cross-boundary propagation. Federation HTTP calls carry no W3C traceparent header, so the receiving peer starts a new trace instead of continuing the caller's trace.
  3. No request-scoped context. There is no contextvars-based carrier that threads a trace ID from the HTTP entry point through policy evaluation, capability dispatch, LLM calls, and provenance recording.

Without propagation, distributed traces are fragmented and unusable for debugging multi-hop agentic workflows.

Decision

Introduce a pure-stdlib trails.trace_context module that provides:

  • TraceContext dataclass — carries trace_id (32-hex), span_id (16-hex), optional parent_span_id, sampling flag, and baggage. Supports child(), to_traceparent() (W3C format 00-{trace_id}-{span_id}-{flags}), and from_traceparent().
  • contextvars propagation: a _current_trace ContextVar with current_trace() / set_trace() accessors.
  • traced_span context manager — creates a child span from the current context, fires span_started / span_ended events to the observer system, and restores the parent on exit.
  • Middleware functions for capability invocation, LLM calls, federation header inject/extract, and PROV-O annotation — all opt-in, no modification of existing source files required.
  • TracingPipelineStage — a pipeline stage that creates a root trace context (or extracts one from incoming headers) and sets it as the current context for the request.

Design constraints

  • No hard dependency on opentelemetry-sdk. The module uses only stdlib (contextvars, secrets, dataclasses, re, time, logging, functools).
  • Observer integration only. Events flow through the existing trails.observability.emit() mechanism so any registered observer (including OTLPExporter) automatically receives span lifecycle events.
  • Opt-in middleware. Existing modules are not modified; trace propagation is wired by wrapping functions or inserting the pipeline stage.

Consequences

  • Distributed traces across federation peers become end-to-end visible.
  • PROV-O activities gain trails:traceId and trails:spanId predicates, linking provenance to traces.
  • Future work: automatic middleware wiring via trails.toml config ([tracing] auto_instrument = true).
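
As a rough illustration of the provenance linkage described above, the sketch below attaches the two predicates to an activity record. The dict-based, JSON-LD-like representation, the annotate_activity name, and the example identifiers are assumptions for the example; only the predicate names trails:traceId and trails:spanId come from this ADR.

```python
from typing import Any, Dict


def annotate_activity(
    activity: Dict[str, Any], trace_id: str, span_id: str
) -> Dict[str, Any]:
    """Return a copy of a PROV-O activity record with trace identifiers attached."""
    annotated = dict(activity)               # leave the original record untouched
    annotated["trails:traceId"] = trace_id
    annotated["trails:spanId"] = span_id
    return annotated


# Hypothetical activity record and identifiers, for illustration only.
activity = {"@id": "urn:example:activity/1", "@type": "prov:Activity"}
linked = annotate_activity(
    activity,
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736",
    span_id="00f067aa0ba902b7",
)
```

With both identifiers on the activity, a provenance record can be matched to the span that produced it, and a span can be matched back to its provenance.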