Testing¶
trails.testing ships fixtures and context managers that every Trails
test suite needs: registry isolation, LLM mocking without API keys,
observability capture, and a Context factory that does not require
invoke(). Importing the module is side-effect free.
Quickstart¶
from trails import capability
from trails.testing import isolated_kernel, mock_llm

def test_summarize_uses_llm():
    with isolated_kernel(), mock_llm("short summary") as llm:
        @capability(id="note.summarize")
        def summarize(ctx, text: str) -> dict:
            return {"summary": ctx.llm.complete(
                [{"role": "user", "content": text}]
            ).text}

        from trails.runtime import invoke
        out = invoke("note.summarize", {"text": "..."})
        assert out["summary"] == "short summary"
        assert llm.calls  # mock recorded the request
isolated_kernel() keeps the handler out of sibling tests' registries.
mock_llm routes ctx.llm through a canned LLMClient — no network
call or API key needed.
isolated_kernel()¶
Context manager that snapshots and clears every module-level registry, restoring the previous state on exit — even on exceptions.
Stashed registries:
- trails.decorators._handlers: the @capability registry
- trails.orm._NODE_TYPES / _NODE_TYPES_BY_NAME: the @node_type registry
- trails.shapes._SHAPES / _SHAPES_BY_NODE_TYPE: the @shape registry
The kernel Store (trails.runtime._STORE) is not cleared — most
tests need registry isolation, not a fresh graph. When you need a clean
graph, set trails.runtime._STORE = None or use fresh_context().
Recommended conftest.py setup:
import pytest
from trails.testing import isolated_kernel

@pytest.fixture(autouse=True)
def _isolated():
    with isolated_kernel():
        yield
mock_llm()¶
mock_llm(
    response_or_fn: str | dict | Callable[[list], str] | None = None,
    *,
    fail_first: int = 0,
    model: str = "mock:canned",
) -> Iterator[LLMClient]
Installs a mock LLMClient via Context.set_llm_override so every
Context inside the block returns the mock as ctx.llm. Clears the
override on exit.
Three response shapes:
with mock_llm("fixed reply") as llm:    # str: returned verbatim
    ...

with mock_llm({"key": "val"}) as llm:   # dict: JSON-serialised
    ...

def fn(messages): return "dynamic"
with mock_llm(fn) as llm:               # callable(messages) -> str
    ...
Simulating transient errors with fail_first:
with mock_llm("final answer", fail_first=2) as llm:
    # First two ctx.llm.complete() calls raise TransientError,
    # the third returns "final answer".
    ...
Inspecting calls: the yielded client exposes .calls, so a test can assert
that the mock was actually invoked (for example, assert llm.calls).
Model identifier: defaults to "mock:canned". Pass model=... when a test
asserts on LLMResponse.model.
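The options above (three response shapes, fail_first, .calls, and model) can be pictured with a small stand-in client. This is a sketch under assumptions, not the trails mock: `SketchMockLLM`, `SketchResponse`, and the local `TransientError` class are hypothetical names, the shape of each `.calls` entry (one messages list per call) is a guess, and so is the choice to record failed attempts and the fallback text used when no response is given.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Union


class TransientError(Exception):
    """Local stand-in for the transient error type raised by the real mock."""


@dataclass
class SketchResponse:
    text: str
    model: str


@dataclass
class SketchMockLLM:
    response_or_fn: Union[str, dict, Callable[[list], str], None] = None
    fail_first: int = 0
    model: str = "mock:canned"
    calls: list = field(default_factory=list)  # assumed: one messages list per call

    def complete(self, messages: list) -> SketchResponse:
        self.calls.append(messages)  # assumption: failed attempts are recorded too
        if len(self.calls) <= self.fail_first:
            raise TransientError("simulated transient failure")
        r = self.response_or_fn
        if callable(r):
            text = r(messages)        # callable(messages) -> str
        elif isinstance(r, dict):
            text = json.dumps(r)      # dict is JSON-serialised
        elif r is None:
            text = "canned reply"     # assumed default, not confirmed for mock_llm
        else:
            text = r                  # str is returned verbatim
        return SketchResponse(text=text, model=self.model)


msgs = [{"role": "user", "content": "hi"}]
llm = SketchMockLLM("final answer", fail_first=2, model="mock:test")
failures = 0
for _ in range(2):
    try:
        llm.complete(msgs)
    except TransientError:
        failures += 1
resp = llm.complete(msgs)  # third call succeeds
```

A retry loop under test would see exactly `fail_first` failures before the canned answer, which is what makes the option useful for exercising backoff logic without a network.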
capture_events()¶
Context manager that yields a list collecting every (kind, event)
tuple emitted via trails.observability.emit inside the block. The
observer is unregistered on exit.
from trails.testing import capture_events
from trails.runtime import invoke

with capture_events() as events:
    invoke("note.summarize", {"text": "..."})

kinds = [k for k, _ in events]
assert "capability_started" in kinds
assert "capability_completed" in kinds

completed = next(e for k, e in events if k == "capability_completed")
assert completed["capability_id"] == "note.summarize"
Common event kinds: capability_started, capability_completed,
capability_error, llm_request, llm_response.
Event dicts are copied on capture, so mutating them is safe. Events emitted after the block exits are not collected.
fresh_context()¶
Returns a Context bound to the kernel store with a minted UUIDv4
trace_id. Use it when a test needs ctx.kg or Model.find() without
going through invoke():
from trails.testing import fresh_context

# Note is assumed to be a @node_type model defined elsewhere in your app.
ctx = fresh_context()
ctx.kg.add(Note(title="seed"))
hits = Note.where(title="seed").fetch(ctx)
assert len(hits) == 1
Each call returns a new Context — state does not leak between calls.
The backing store is the singleton kernel store, so KG data persists
within a test unless you reset _STORE.
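The distinction between per-call Context state and the shared store can be illustrated with a stand-in factory. This is a sketch only: `SketchContext`, `fresh_context_sketch`, and the dict used as `_STORE` are hypothetical, and the parameter defaults simply mirror the signature listed in the API reference.

```python
import uuid
from dataclasses import dataclass

_STORE: dict = {}  # hypothetical singleton store shared by every context


@dataclass
class SketchContext:
    store: dict
    principal: str
    trace_id: str


def fresh_context_sketch(principal="did:local:test", trace_id=None):
    """Return a new context object bound to the shared store."""
    return SketchContext(
        store=_STORE,
        principal=principal,
        trace_id=trace_id or str(uuid.uuid4()),  # mint a UUIDv4 when none is given
    )


a = fresh_context_sketch()
b = fresh_context_sketch()
a.store["note-1"] = {"title": "seed"}  # written through one context...
seen_by_b = b.store["note-1"]          # ...and visible through the other
```

Each call yields a distinct object with its own trace_id, while data written through one context remains visible through the next, which is why tests that need a clean graph must reset the store explicitly.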
register_principal_role()¶
Thin wrapper around trails.policy.register_principal_attrs for the
common {"role": role} case. Wire the principal before invoking
@policy-protected capabilities:
from trails.testing import register_principal_role
register_principal_role("did:local:test", "admin")
Pytest fixtures¶
Opt in by adding pytest_plugins = ["trails.testing"] to your conftest.py.
trails_isolated: per-test isolated_kernel() wrapper. Request it directly
as a test argument, or depend on it from an autouse fixture.
trails_mock_llm: yields the installed mock. Accepts indirect
parametrisation:
@pytest.mark.parametrize("trails_mock_llm", ["hi"], indirect=True)
def test_it(trails_mock_llm):
    resp = trails_mock_llm.complete([{"role": "user", "content": "?"}])
    assert resp.text == "hi"
Without parametrisation the default "canned reply" is used.
Testing patterns¶
Unit testing capabilities¶
from trails import capability
from trails.testing import isolated_kernel, mock_llm

def test_greet():
    with isolated_kernel():
        @capability
        def greet(name: str) -> dict:
            return {"greeting": f"hi {name}"}

        import trails
        env = trails.invoke("greet", {"name": "world"})
        assert env["payload"] == {"greeting": "hi world"}
Testing middleware¶
Test @before, @after, @on_error hooks by registering both the
capability and middleware inside isolated_kernel():
import trails
from trails.decorators import before
from trails.testing import isolated_kernel

def test_before_hook_transforms_args():
    with isolated_kernel():
        @trails.capability
        def greet(name: str) -> dict:
            return {"greeting": f"hi {name}"}

        @before("greet")
        def _upcase(ctx, args):
            return {"name": args["name"].upper()}

        env = trails.invoke("greet", {"name": "world"})
        assert env["payload"] == {"greeting": "hi WORLD"}
Testing policies¶
Use register_principal_role() to wire roles, then evaluate:
from trails.testing import isolated_kernel, register_principal_role
from trails.policy import evaluate_policies, PolicyContext

def test_admin_can_delete():
    with isolated_kernel():
        register_principal_role("did:local:test", "admin")
        ctx = PolicyContext(
            principal="did:local:test",
            action="record.delete",
            resource={}, environment={},
        )
        # allow_admin_policy is assumed to be a policy defined elsewhere in your app.
        decisions = evaluate_policies([allow_admin_policy], ctx)
        assert all(d.effect == "permit" for d in decisions)
Testing ORM models¶
from trails import node_type
from trails.testing import fresh_context, isolated_kernel

def test_model_crud():
    with isolated_kernel():
        @node_type("Task", fields={"title": str, "done": bool})
        class Task: ...

        ctx = fresh_context()
        task = Task(title="Write tests", done=False)
        ctx.kg.add(task)
        found = Task.where(done=False).fetch(ctx)
        assert len(found) >= 1
        assert found[0].title == "Write tests"
Integration tests with ctx.kg¶
For a clean graph, reset _STORE before the test. Combine with
mock_llm() when the capability touches both LLM and KG:
import trails.runtime
from trails import capability, node_type
from trails.testing import fresh_context, isolated_kernel, mock_llm

def test_capability_writes_to_kg():
    with isolated_kernel(), mock_llm("Generated text"):
        trails.runtime._STORE = None  # clean graph

        @node_type("Article", fields={"body": str})
        class Article: ...

        @capability(id="article.generate")
        def generate(ctx, topic: str) -> dict:
            body = ctx.llm.complete(
                [{"role": "user", "content": topic}]
            ).text
            a = Article(body=body)
            ctx.kg.add(a)
            return {"id": a.id}

        from trails.runtime import invoke
        invoke("article.generate", {"topic": "testing"})

        ctx = fresh_context()
        articles = Article.where(body="Generated text").fetch(ctx)
        assert len(articles) == 1
Complete conftest.py template¶
A recommended starting point for any Trails app test suite:
"""conftest.py — Trails app test setup."""
import pytest
from trails.testing import isolated_kernel, mock_llm, fresh_context

pytest_plugins = ["trails.testing"]

@pytest.fixture(autouse=True)
def _isolated():
    """Every test gets a clean registry."""
    with isolated_kernel():
        yield

@pytest.fixture
def ctx():
    """A ready-to-use Context for KG operations."""
    return fresh_context()

@pytest.fixture
def llm():
    """A mock LLM returning a default response."""
    with mock_llm("test response") as client:
        yield client
End-to-end example¶
A capability that calls an LLM and writes the result to the KG, tested with isolation, mocking, and event capture:
from trails import capability, node_type
from trails.runtime import invoke
from trails.testing import capture_events, isolated_kernel, mock_llm

def test_summarize_writes_note_and_emits_completion():
    with isolated_kernel(), mock_llm("TL;DR: fine") as llm:
        @node_type("Summary", fields={"text": str})
        class Summary: ...

        @capability(id="note.summarize")
        def summarize(ctx, text: str) -> dict:
            reply = ctx.llm.complete(
                [{"role": "user", "content": text}]
            ).text
            s = Summary(text=reply)
            ctx.kg.add(s)
            return {"id": s.id, "summary": reply}

        with capture_events() as events:
            out = invoke("note.summarize", {"text": "..."})

        assert out["summary"] == "TL;DR: fine"
        assert llm.calls, "mock was not invoked"
        assert any(k == "capability_completed" for k, _ in events)
API reference¶
| Symbol | Kind | Purpose |
|---|---|---|
| isolated_kernel() | context manager | Snapshot + clear + restore _handlers, _NODE_TYPES, _SHAPES registries |
| mock_llm(response_or_fn, *, fail_first=0, model="mock:canned") | context manager | Install mock LLMClient via Context.set_llm_override; yields client |
| capture_events() | context manager | Collect every (kind, event) tuple emitted inside the block; yields list |
| fresh_context(principal="did:local:test", trace_id=None) | factory | Build Context bound to kernel store for direct ctx.kg / Model.find |
| register_principal_role(principal, role) | function | Set {"role": role} on principal attrs for Cedar policy eval |
| trails_isolated | pytest fixture | Per-test isolated_kernel(). Requires pytest_plugins = ["trails.testing"] |
| trails_mock_llm | pytest fixture | Per-test mock_llm() with indirect parametrisation support |
End-to-end test suite and traceability matrix¶
The python/tests/e2e/ directory contains cross-cutting E2E tests that
exercise full request lifecycles across subsystems (ORM, capabilities,
agents, ingestion, RML, federation, auto-ontology). Each test is tagged
with requirement IDs via @pytest.mark.traces("REQ-ID").
Running the suite with traceability output enabled generates two artifacts:
- python/tests/e2e/traceability-matrix.html: visual coverage report showing which requirements are covered, test status, duration, and exercised subsystems.
- python/tests/e2e/traceability-matrix.json: machine-readable equivalent for CI integration.
Requirements are declared in python/tests/e2e/requirements.yaml.
Tests without a traces marker still run but do not appear in the
matrix. The traceability plugin (python/tests/e2e/traceability.py)
is a standard pytest plugin — add it to conftest.py or use
pytest_plugins = ["tests.e2e.traceability"].
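To make the idea of a traceability matrix concrete, the sketch below maps requirement IDs to the tests that trace them. It is purely illustrative: the requirement IDs, test names, the `build_matrix` helper, and the output shape are all hypothetical and are not taken from the trails plugin or its JSON format.

```python
import json

# Hypothetical requirement IDs, standing in for entries in requirements.yaml.
requirements = ["REQ-ORM-1", "REQ-CAP-1", "REQ-FED-1"]

# (test name, requirement IDs from its traces marker, outcome); all invented.
results = [
    ("test_model_crud", ["REQ-ORM-1"], "passed"),
    ("test_invoke_roundtrip", ["REQ-CAP-1", "REQ-ORM-1"], "failed"),
    ("test_untraced_helper", [], "passed"),  # no marker: runs, but is not listed
]


def build_matrix(requirements, results):
    """Group test outcomes under each requirement they trace."""
    matrix = {req: [] for req in requirements}
    for name, req_ids, outcome in results:
        for req in req_ids:
            matrix.setdefault(req, []).append({"test": name, "outcome": outcome})
    return matrix


matrix = build_matrix(requirements, results)
uncovered = [req for req, tests in matrix.items() if not tests]
report = json.dumps(matrix, indent=2)  # machine-readable form for CI
```

Requirements with an empty list are the coverage gaps, and a test without a marker never appears under any requirement, matching the behaviour described above.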
See also: ORM · Agent Runtime · LLM Client & Session · Policy & Authorization.