ADR-0031: Federation Ontology Exchange

Status: Accepted
Date: 2026-04-18

Context

Trails federation (ADR-0023) allows instances to query each other via SPARQL SERVICE clauses and invoke remote capabilities via MCP relay. However, federated queries are currently blind: Instance A has no idea what node types, predicates, or shapes Instance B exposes. A user writing a SERVICE query against a peer must know the peer's schema out of band — there is no discovery mechanism.

This creates several problems:

Fragile queries. Federated SPARQL queries reference remote IRIs by hardcoded strings. A schema change on the remote peer silently breaks queries.
No tooling support. CLI and IDE tooling cannot autocomplete or validate predicates when the schema is unknown.
Opaque mesh. The mesh manager (Phase 4) tracks peer health but not what each peer contains. An operator looking at the output of trails federation peers sees URLs and latencies but not what data lives behind them.

Other knowledge-graph ecosystems and vocabularies (Solid, DCAT, VoID) address this with schema or dataset description documents. Trails needs an equivalent that fits its progressive-enhancement model (ADR-0021) and integrates with the existing federation, MCP, and CLI surfaces.

Decision

Implement federation ontology exchange in three levels. This ADR covers Levels 1 and 2; Level 3 is deferred.

Level 1: Schema Advertisement

Each Trails instance exposes a schema advertisement — a JSON document listing:

Registered @node_type definitions (name, IRI, fields, types)
SHACL constraints from @shape declarations
Registered capability IDs
Data-discovered predicates (from the store, via SPARQL aggregation)
Instance metadata (name, base IRI, Trails version, generation timestamp)
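As a rough illustration of the shape such a document might take, the sketch below models the advertisement with Python dataclasses and round-trips it through JSON. The class and function names follow those listed under Module Structure, but every field name (fields, usage_count, generated_at, and so on), the example IRIs, and the placeholder version string are assumptions made for this example, not the actual wire format.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass
class NodeTypeInfo:
    name: str
    iri: str
    # Assumed layout: field name -> type name.
    fields: dict[str, str] = field(default_factory=dict)


@dataclass
class PredicateInfo:
    iri: str
    usage_count: int = 0  # hypothetical: number of triples using this predicate


@dataclass
class SchemaAdvertisement:
    instance_name: str
    base_iri: str
    trails_version: str
    generated_at: str  # ISO 8601 timestamp
    node_types: list[NodeTypeInfo] = field(default_factory=list)
    shapes: list[dict] = field(default_factory=list)       # SHACL constraints, left opaque here
    capabilities: list[str] = field(default_factory=list)  # registered capability IDs
    predicates: list[PredicateInfo] = field(default_factory=list)


def schema_to_json(ad: SchemaAdvertisement) -> str:
    # asdict() recurses into nested dataclasses, so the result is plain JSON data.
    return json.dumps(asdict(ad), indent=2, sort_keys=True)


def schema_from_json(text: str) -> SchemaAdvertisement:
    raw = json.loads(text)
    # Rebuild the nested dataclasses that asdict() flattened into dicts.
    raw["node_types"] = [NodeTypeInfo(**n) for n in raw.get("node_types", [])]
    raw["predicates"] = [PredicateInfo(**p) for p in raw.get("predicates", [])]
    return SchemaAdvertisement(**raw)


example = SchemaAdvertisement(
    instance_name="warehouse",
    base_iri="trails://warehouse/",
    trails_version="0.0.0",  # placeholder, not a real release number
    generated_at=datetime.now(timezone.utc).isoformat(),
    node_types=[
        NodeTypeInfo("Pallet", "trails://warehouse/Pallet", {"sku": "str", "weight_kg": "float"})
    ],
    capabilities=["inventory.lookup"],
    predicates=[PredicateInfo("trails://warehouse/storedIn", usage_count=42)],
)
```

Calling schema_from_json(schema_to_json(example)) returns an object equal to example, which is the property a peer relies on when it parses a fetched advertisement.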

The advertisement is served at:

HTTP: GET /schema on the FastAPI adapter (JSON response)
MCP: trails://schema resource (JSON text)

Level 2: Schema Discovery

Instances can fetch and cache remote peer schemas:

fetch_peer_schema(peer_url) fetches /schema from a peer
SchemaCache provides TTL-based in-memory caching (default 5 min)
MeshManager.schema_cache is populated during discover_peers()
discover_peers() includes a "schema" key in each peer dict
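The caching behaviour described above can be sketched as follows. This is a minimal illustration, not the MeshManager implementation: discover_peers is written as a standalone function rather than a method, and the get/put method names, the injectable clock, and the fetch callable are assumptions introduced for the example.

```python
import threading
import time
from typing import Any, Callable, Optional


class SchemaCache:
    """Thread-safe in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds          # 300 s matches the 5 minute default
        self._clock = clock              # injectable so expiry can be tested without sleeping
        self._lock = threading.Lock()
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, peer_url: str) -> Optional[Any]:
        with self._lock:
            entry = self._entries.get(peer_url)
            if entry is None:
                return None
            stored_at, schema = entry
            if self._clock() - stored_at >= self._ttl:
                del self._entries[peer_url]  # expired: drop it and report a miss
                return None
            return schema

    def put(self, peer_url: str, schema: Any) -> None:
        with self._lock:
            self._entries[peer_url] = (self._clock(), schema)


def discover_peers(peer_urls, cache, fetch):
    """Return one dict per peer with a "schema" key (None if unavailable)."""
    peers = []
    for url in peer_urls:
        schema = cache.get(url)
        if schema is None:
            schema = fetch(url)          # e.g. fetch_peer_schema
            if schema is not None:
                cache.put(url, schema)
        peers.append({"url": url, "schema": schema})
    return peers
```

With this shape, a second discovery round inside the TTL window is served from the cache, and a peer without a schema still appears in the result with "schema" set to None.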

Level 3: Schema Alignment (deferred)

Negotiation of shared vocabularies between peers — e.g. mapping trails://app-a/Patient to trails://app-b/Subject via owl:sameAs or SKOS mappings. This requires consensus protocols and is out of scope for this increment.

CLI Surface

trails federation schema — show local schema advertisement
trails federation schema --peer warehouse — fetch and show a remote peer's schema
trails federation schema --json — raw JSON output
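For illustration only, the three invocations above could be parsed as shown below. The sketch uses argparse from the standard library; the ADR does not say which CLI framework Trails uses, so the parser layout and the dest names (command, fed_command, as_json) are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="trails")
    commands = parser.add_subparsers(dest="command", required=True)

    federation = commands.add_parser("federation")
    fed_commands = federation.add_subparsers(dest="fed_command", required=True)

    schema = fed_commands.add_parser("schema", help="show a schema advertisement")
    # Without --peer the local advertisement is shown.
    schema.add_argument("--peer", default=None, help="name of a remote peer")
    schema.add_argument("--json", action="store_true", dest="as_json",
                        help="emit raw JSON instead of a formatted view")
    return parser


args = build_parser().parse_args(["federation", "schema", "--peer", "warehouse", "--json"])
```

Here args.peer is "warehouse" and args.as_json is True; omitting both flags selects the formatted view of the local schema.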

Module Structure

New module: trails.federation_schema containing:

SchemaAdvertisement, NodeTypeInfo, PredicateInfo — dataclasses
build_local_schema() — combines ORM, shapes, capabilities, and store data into an advertisement
schema_to_json() / schema_from_json() — serialization
fetch_peer_schema() — HTTP fetch with error handling
SchemaCache — thread-safe TTL cache
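A minimal sketch of what "HTTP fetch with error handling" could look like, using only the standard library. The timeout default, the opener parameter (added so the function can be exercised without a network), and the choice to return None on any failure are assumptions of this example rather than the documented behaviour of trails.federation_schema.

```python
import json
import urllib.request
from typing import Callable, Optional


def fetch_peer_schema(
    peer_url: str,
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,  # hypothetical parameter, injectable for tests
) -> Optional[dict]:
    """Fetch <peer_url>/schema and return the parsed JSON, or None if unavailable."""
    url = peer_url.rstrip("/") + "/schema"
    try:
        with opener(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except OSError:
        # Covers urllib.error.HTTPError (including 404 from peers that do not
        # expose /schema), urllib.error.URLError, and socket timeouts.
        return None
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return None
```

Returning None instead of raising keeps the feature additive: a caller such as discover_peers() can record the peer without a schema and carry on.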

Integration points (minimal changes to existing modules):

http_adapter.py — new GET /schema route
mcp_resources.py — register_schema_resource() for trails://schema
federation_mesh.py — SchemaCache on MeshManager, populated during discover_peers()
cli/federation.py — schema subcommand

Consequences

Positive

Federated queries are no longer blind — peers can inspect each other's schemas before writing SERVICE clauses.
CLI tooling can display what a peer offers (trails federation schema --peer X), reducing trial-and-error.
The mesh view becomes richer: operators see not just "is it up?" but "what does it have?"
Schema caching avoids redundant fetches during health check rounds.
The advertisement is purely additive: instances that don't expose /schema simply return 404 and federation continues to work as before.

Negative

Schema advertisements reveal the internal structure of an instance. In security-sensitive deployments, the /schema endpoint should be gated by Cedar policy (future work).
The advertisement is a point-in-time snapshot. Dynamic schema changes (such as a new @node_type registered at runtime) become visible to a peer only after its next fetch. The cache TTL bounds how stale a cached copy can be but does not eliminate staleness.
Level 3 alignment is deferred, so cross-instance vocabulary mapping remains manual.

Risks

Schema drift. If a peer changes its schema between advertisement fetches, cached information becomes stale. The TTL (default 5 min) bounds the staleness window.
Large schemas. An instance with hundreds of node types produces a large JSON document. Pagination is not implemented in Levels 1 and 2 but can be added if needed.