Testing¶

trails.testing ships fixtures and context managers that every Trails test suite needs: registry isolation, LLM mocking without API keys, observability capture, and a Context factory that does not require invoke(). Importing the module is side-effect free.

Quickstart¶

from trails import capability
from trails.testing import isolated_kernel, mock_llm

def test_summarize_uses_llm():
    with isolated_kernel(), mock_llm("short summary") as llm:
        @capability(id="note.summarize")
        def summarize(ctx, text: str) -> dict:
            return {"summary": ctx.llm.complete(
                [{"role": "user", "content": text}]
            ).text}

        from trails.runtime import invoke
        out = invoke("note.summarize", {"text": "..."})
        assert out["summary"] == "short summary"
        assert llm.calls  # mock recorded the request

isolated_kernel() keeps the handler out of sibling tests' registries. mock_llm routes ctx.llm through a canned LLMClient — no network call or API key needed.

`isolated_kernel()`¶

Context manager that snapshots and clears every module-level registry, restoring the previous state on exit — even on exceptions.

Stashed registries:

trails.decorators._handlers — @capability registry
trails.orm._NODE_TYPES / _NODE_TYPES_BY_NAME — @node_type registry
trails.shapes._SHAPES / _SHAPES_BY_NODE_TYPE — @shape registry

The kernel Store (trails.runtime._STORE) is not cleared: most tests need registry isolation, not a fresh graph. When you need a clean graph, set trails.runtime._STORE = None. fresh_context() on its own does not give you one, because it binds to the same kernel store.

Recommended conftest.py setup:

import pytest
from trails.testing import isolated_kernel

@pytest.fixture(autouse=True)
def _isolated():
    with isolated_kernel():
        yield

`mock_llm()`¶

mock_llm(
    response_or_fn: str | dict | Callable[[list], str] | None = None,
    *,
    fail_first: int = 0,
    model: str = "mock:canned",
) -> Iterator[LLMClient]

Installs a mock LLMClient via Context.set_llm_override so every Context inside the block returns the mock as ctx.llm. Clears the override on exit.

Three response shapes:

with mock_llm("fixed reply") as llm:       # str — returned verbatim
    ...
with mock_llm({"key": "val"}) as llm:      # dict — JSON-serialised
    ...
def fn(messages):
    return "dynamic"
with mock_llm(fn) as llm:                  # callable(messages) -> str
    ...
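How the three shapes might be turned into reply text, as a sketch under assumptions: `render_reply` is a hypothetical helper, not part of trails.testing, and the `None` default from the signature is not covered because its behaviour is not described here.

```python
import json
from typing import Callable, Union

Response = Union[str, dict, Callable[[list], str]]


def render_reply(response: Response, messages: list) -> str:
    """Turn a canned response spec into reply text (illustrative only)."""
    if callable(response):
        return response(messages)        # callable(messages) -> str
    if isinstance(response, dict):
        return json.dumps(response)      # dict is JSON-serialised
    return response                      # str is returned verbatim


messages = [{"role": "user", "content": "ping"}]
fixed = render_reply("fixed reply", messages)
as_json = render_reply({"key": "val"}, messages)
dynamic = render_reply(lambda msgs: msgs[-1]["content"].upper(), messages)
```

The callable form is the one to reach for when the reply has to depend on the prompt, for example echoing part of the last message.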

Simulating transient errors with fail_first:

with mock_llm("final answer", fail_first=2) as llm:
    # First two ctx.llm.complete() calls raise TransientError,
    # the third returns "final answer".
    ...
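fail_first is mainly useful for exercising retry logic. The sketch below shows the shape of such a test without importing trails: `FlakyClient`, `TransientError`, and `complete_with_retry` are hypothetical stand-ins, the stand-in returns a plain string rather than a response object, and recording failed attempts in `.calls` is an assumption rather than documented behaviour.

```python
class TransientError(Exception):
    """Stand-in for the transient error raised by the mock."""


class FlakyClient:
    """Fails the first `fail_first` calls, then returns a canned reply."""

    def __init__(self, reply: str, fail_first: int = 0):
        self.reply = reply
        self.remaining_failures = fail_first
        self.calls: list = []

    def complete(self, messages: list) -> str:
        self.calls.append(messages)      # assumption: failed attempts are recorded too
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise TransientError("simulated transient failure")
        return self.reply


def complete_with_retry(client, messages: list, attempts: int = 3) -> str:
    """Hypothetical code under test: retry on TransientError."""
    for attempt in range(1, attempts + 1):
        try:
            return client.complete(messages)
        except TransientError:
            if attempt == attempts:
                raise                    # out of attempts: surface the error


client = FlakyClient("final answer", fail_first=2)
result = complete_with_retry(client, [{"role": "user", "content": "hi"}])
```

Setting fail_first equal to the retry budget is a quick way to check that the error surfaces once the budget is exhausted.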

Inspecting calls: the yielded client exposes .calls for assertions:

from trails.runtime import invoke

with mock_llm("ok") as llm:
    invoke("my-cap", {"text": "hello"})
    assert len(llm.calls) == 1

Model identifier: defaults to "mock:canned". Override when a test asserts on LLMResponse.model:

with mock_llm("ok", model="test-model") as llm:
    assert llm.complete([...]).model == "test-model"

`capture_events()`¶

Context manager that yields a list collecting every (kind, event) tuple emitted via trails.observability.emit inside the block. The observer is unregistered on exit.

from trails.testing import capture_events
from trails.runtime import invoke

with capture_events() as events:
    invoke("note.summarize", {"text": "..."})

kinds = [k for k, _ in events]
assert "capability_started" in kinds
assert "capability_completed" in kinds

completed = next(e for k, e in events if k == "capability_completed")
assert completed["capability_id"] == "note.summarize"

Common event kinds: capability_started, capability_completed, capability_error, llm_request, llm_response.

Event dicts are copied on capture, so mutating them is safe. Events emitted after the block exits are not collected.

`fresh_context()`¶

fresh_context(
    principal: str = "did:local:test",
    trace_id: str | None = None,
) -> Context

Returns a Context bound to the kernel store. When trace_id is omitted, a UUIDv4 is minted for it. Use fresh_context() when a test needs ctx.kg or ORM queries such as Model.find() without going through invoke():

from trails.testing import fresh_context

ctx = fresh_context()
# Note is assumed to be a @node_type model registered elsewhere in the suite.
ctx.kg.add(Note(title="seed"))
hits = Note.where(title="seed").fetch(ctx)
assert len(hits) == 1

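Each call returns a new Context, so Context state does not leak between calls. The backing store is the singleton kernel store, however, so KG data written through one Context is visible through the next and persists until you reset _STORE; isolated_kernel() does not clear it.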
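The difference between a new Context and a new store is easy to miss, so here is a small model of it. Everything in the sketch is a hypothetical stand-in (`Ctx`, `make_context`, `get_store`, and a list used as the store), and the lazy re-creation of the store after a reset is an assumption.

```python
import uuid
from dataclasses import dataclass

_state = {"store": None}  # stand-in for a module-level singleton store


def get_store() -> list:
    """Create the store lazily and reuse it until it is reset."""
    if _state["store"] is None:
        _state["store"] = []
    return _state["store"]


@dataclass
class Ctx:
    principal: str
    trace_id: str
    kg: list


def make_context(principal: str = "did:local:test", trace_id=None) -> Ctx:
    """New Ctx per call, always bound to the shared store."""
    return Ctx(principal, trace_id or str(uuid.uuid4()), get_store())


a = make_context()
b = make_context()
a.kg.append("seed")
shared = "seed" in b.kg              # both contexts see the same store
distinct = a.trace_id != b.trace_id  # but each has its own trace_id

_state["store"] = None               # reset, as described above
c = make_context()
clean = c.kg == []                   # a context created after the reset starts empty
```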

`register_principal_role()`¶

register_principal_role(principal: str, role: str) -> None

Thin wrapper around trails.policy.register_principal_attrs for the common {"role": role} case. Wire the principal before invoking @policy-protected capabilities:

from trails.testing import register_principal_role

register_principal_role("did:local:test", "admin")

Pytest fixtures¶

Opt in by adding to your conftest.py:

pytest_plugins = ["trails.testing"]

trails_isolated — per-test isolated_kernel() wrapper. Request it directly or wire as autouse:

@pytest.fixture(autouse=True)
def _auto_isolated(trails_isolated):
    yield

trails_mock_llm — yields the installed mock. Accepts indirect parametrisation:

@pytest.mark.parametrize("trails_mock_llm", ["hi"], indirect=True)
def test_it(trails_mock_llm):
    resp = trails_mock_llm.complete([{"role": "user", "content": "?"}])
    assert resp.text == "hi"

Without parametrisation the default "canned reply" is used.

Testing patterns¶

Unit testing capabilities¶

from trails import capability
from trails.testing import isolated_kernel

def test_greet():
    with isolated_kernel():
        @capability
        def greet(name: str) -> dict:
            return {"greeting": f"hi {name}"}

        import trails
        env = trails.invoke("greet", {"name": "world"})
        assert env["payload"] == {"greeting": "hi world"}

Testing middleware¶

Test @before, @after, @on_error hooks by registering both the capability and middleware inside isolated_kernel():

import trails
from trails.decorators import before
from trails.testing import isolated_kernel

def test_before_hook_transforms_args():
    with isolated_kernel():
        @trails.capability
        def greet(name: str) -> dict:
            return {"greeting": f"hi {name}"}

        @before("greet")
        def _upcase(ctx, args):
            return {"name": args["name"].upper()}

        env = trails.invoke("greet", {"name": "world"})
        assert env["payload"] == {"greeting": "hi WORLD"}

Testing policies¶

Use register_principal_role() to wire roles, then evaluate:

from trails.testing import isolated_kernel, register_principal_role
from trails.policy import evaluate_policies, PolicyContext

def test_admin_can_delete():
    with isolated_kernel():
        register_principal_role("did:local:test", "admin")
        ctx = PolicyContext(
            principal="did:local:test",
            action="record.delete",
            resource={}, environment={},
        )
        # allow_admin_policy is your application's policy object, defined elsewhere.
        decisions = evaluate_policies([allow_admin_policy], ctx)
        assert all(d.effect == "permit" for d in decisions)

Testing ORM models¶

from trails import node_type
from trails.testing import fresh_context, isolated_kernel

def test_model_crud():
    with isolated_kernel():
        @node_type("Task", fields={"title": str, "done": bool})
        class Task: ...

        ctx = fresh_context()
        task = Task(title="Write tests", done=False)
        ctx.kg.add(task)

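        found = Task.where(done=False).fetch(ctx)
        # The kernel store is not cleared by isolated_kernel(), so other
        # Task nodes may be present; match on the title rather than position.
        assert any(t.title == "Write tests" for t in found)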

Integration tests with `ctx.kg`¶

For a clean graph, reset trails.runtime._STORE at the start of the test. Combine with mock_llm() when the capability touches both LLM and KG:

import trails.runtime
from trails import capability, node_type
from trails.testing import fresh_context, isolated_kernel, mock_llm

def test_capability_writes_to_kg():
    with isolated_kernel(), mock_llm("Generated text"):
        trails.runtime._STORE = None  # clean graph

        @node_type("Article", fields={"body": str})
        class Article: ...

        @capability(id="article.generate")
        def generate(ctx, topic: str) -> dict:
            body = ctx.llm.complete(
                [{"role": "user", "content": topic}]
            ).text
            a = Article(body=body)
            ctx.kg.add(a)
            return {"id": a.id}

        from trails.runtime import invoke
        invoke("article.generate", {"topic": "testing"})

        ctx = fresh_context()
        articles = Article.where(body="Generated text").fetch(ctx)
        assert len(articles) == 1

Complete conftest.py template¶

A recommended starting point for any Trails app test suite:

"""conftest.py — Trails app test setup."""
import pytest
from trails.testing import isolated_kernel, mock_llm, fresh_context

pytest_plugins = ["trails.testing"]

@pytest.fixture(autouse=True)
def _isolated():
    """Every test gets a clean registry."""
    with isolated_kernel():
        yield

@pytest.fixture
def ctx():
    """A ready-to-use Context for KG operations."""
    return fresh_context()

@pytest.fixture
def llm():
    """A mock LLM returning a default response."""
    with mock_llm("test response") as client:
        yield client

End-to-end example¶

A capability that calls an LLM and writes the result to the KG, tested with isolation, mocking, and event capture:

from trails import capability, node_type
from trails.runtime import invoke
from trails.testing import capture_events, isolated_kernel, mock_llm

def test_summarize_writes_note_and_emits_completion():
    with isolated_kernel(), mock_llm("TL;DR: fine") as llm:
        @node_type("Summary", fields={"text": str})
        class Summary: ...

        @capability(id="note.summarize")
        def summarize(ctx, text: str) -> dict:
            reply = ctx.llm.complete(
                [{"role": "user", "content": text}]
            ).text
            s = Summary(text=reply)
            ctx.kg.add(s)
            return {"id": s.id, "summary": reply}

        with capture_events() as events:
            out = invoke("note.summarize", {"text": "..."})

        assert out["summary"] == "TL;DR: fine"
        assert llm.calls, "mock was not invoked"
        assert any(k == "capability_completed" for k, _ in events)

API reference¶

Symbol	Kind	Purpose
`isolated_kernel()`	context manager	Snapshot + clear + restore `_handlers`, `_NODE_TYPES`, `_SHAPES` registries
`mock_llm(response_or_fn, *, fail_first=0, model="mock:canned")`	context manager	Install mock `LLMClient` via `Context.set_llm_override`; yields client
`capture_events()`	context manager	Collect every `(kind, event)` tuple emitted inside the block; yields list
`fresh_context(principal="did:local:test", trace_id=None)`	factory	Build `Context` bound to kernel store for direct `ctx.kg` / `Model.find`
`register_principal_role(principal, role)`	function	Set `{"role": role}` on principal attrs for Cedar policy eval
`trails_isolated`	pytest fixture	Per-test `isolated_kernel()`. Requires `pytest_plugins = ["trails.testing"]`
`trails_mock_llm`	pytest fixture	Per-test `mock_llm()` with indirect parametrisation support

End-to-end test suite and traceability matrix¶

The python/tests/e2e/ directory contains cross-cutting E2E tests that exercise full request lifecycles across subsystems (ORM, capabilities, agents, ingestion, RML, federation, auto-ontology). Each test is tagged with requirement IDs via @pytest.mark.traces("REQ-ID").

Run the suite with traceability output:

pytest python/tests/e2e/ --trace-matrix

This generates two artifacts:

python/tests/e2e/traceability-matrix.html — visual coverage report showing which requirements are covered, test status, duration, and exercised subsystems.
python/tests/e2e/traceability-matrix.json — machine-readable equivalent for CI integration.

Requirements are declared in python/tests/e2e/requirements.yaml. Tests without a traces marker still run but do not appear in the matrix. The traceability plugin (python/tests/e2e/traceability.py) is a standard pytest plugin — add it to conftest.py or use pytest_plugins = ["tests.e2e.traceability"].
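The relationship between requirements, tagged tests, and the matrix can be sketched in a few lines. This is not the traceability plugin: `build_matrix`, the requirement IDs, and the output shape are all hypothetical, and the real JSON schema may differ.

```python
import json
from collections import defaultdict


def build_matrix(requirements: list, results: list) -> dict:
    """Map each requirement ID to the tests that trace to it.

    results holds (test_id, requirement_ids, outcome) tuples.
    """
    covered = defaultdict(list)
    for test_id, req_ids, outcome in results:
        for req in req_ids:              # untagged tests contribute nothing
            covered[req].append({"test": test_id, "outcome": outcome})
    return {req: covered.get(req, []) for req in requirements}


matrix = build_matrix(
    ["REQ-1", "REQ-2"],                  # hypothetical requirement IDs
    [
        ("test_a", ["REQ-1"], "passed"),
        ("test_b", [], "passed"),        # no traces marker: runs, but is not listed
    ],
)
uncovered = [req for req, tests in matrix.items() if not tests]
report = json.dumps(matrix, indent=2)
```

Listing requirements with no tests is usually the first thing a CI check on the JSON artifact would look at.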