ADR-0004: Query-time reasoning, opt-in, cached

Status: Accepted
Date: 2026-04-12

Context

RDFS / OWL-RL entailment lets a triple store answer questions the explicit data doesn't directly contain — e.g., "is :Patient1 a schema:Person?" derived from :Patient rdfs:subClassOf schema:Person. Without reasoning, queries miss these implications.
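To make the example concrete, here is a minimal sketch of the two RDFS rules involved (rdfs9 for type propagation, rdfs11 for subclass transitivity), written in plain Python over string tuples. It is an illustration only, not the framework's reasoner. The function name `rdfs_subclass_closure` is hypothetical, and the triple `:Patient1 rdf:type :Patient` is an assumption that the example in the text implies but does not state.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_subclass_closure(triples):
    """Return triples entailed by rdfs9/rdfs11 that are not already explicit."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in closed if p == SUBCLASS_OF]
        new = set()
        for s, p, o in closed:
            for c, d in subclass_pairs:
                if o != c:
                    continue
                if p == RDF_TYPE:
                    new.add((s, RDF_TYPE, d))        # rdfs9: instance of c is instance of d
                elif p == SUBCLASS_OF:
                    new.add((s, SUBCLASS_OF, d))     # rdfs11: subClassOf is transitive
        if not new <= closed:
            closed |= new
            changed = True
    return closed - set(triples)


explicit = {
    (":Patient1", RDF_TYPE, ":Patient"),             # assumed explicit fact
    (":Patient", SUBCLASS_OF, "schema:Person"),
}
inferred = rdfs_subclass_closure(explicit)
# inferred now holds the single triple (":Patient1", "rdf:type", "schema:Person")
```

A query for instances of schema:Person finds :Patient1 only if it runs against the union of `explicit` and `inferred`.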

Reasoning has a cost. Eagerly materializing entailments on every write can inflate write size by 10×. Computing them on every query can raise query latency by 10×. Getting the placement wrong kills framework usability.

Placement options:

1. Eager, pre-handler. On every write, materialize the OWL-RL closure. Handlers see a fully closed graph. Storage and write-latency impact is severe.
2. Eager, background. Writes are fast; closure is materialized asynchronously. Simpler to reason about, but introduces a stale-data window.
3. Query-time, per-query. Closure is computed on demand for each SPARQL query. Slow for every query.
4. Query-time with cache + invalidation. Closure is computed once per graph, cached in an :inferred named graph, and invalidated on premise-graph writes. Lazy but fast once warm.
5. None by default, opt-in per capability. Frame-level: most handlers don't need reasoning; those that do declare it.

Decision

Combination: #4 (query-time, cached, per-named-graph) + #5 (opt-in per capability).

Default reasoning mode for a capability is none.
Capabilities opt in via @capability(reasoning="rdfs" | "owl-rl").
When a reasoning-enabled query executes, the kernel checks whether the :inferred/<graph> cache is current. If yes, query runs against <graph> UNION :inferred/<graph>. If no, closure is computed, stored, and then query runs.
Any write to <graph> marks :inferred/<graph> stale.
Cache warms lazily; there is no background reasoner daemon in v1.
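The check, compute, and invalidate flow described above can be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not the kernel's implementation: `InferenceCache`, `toy_closure`, and the `closure_runs` counter are hypothetical names, graphs are modeled as Python sets rather than named graphs in a store, and a single reasoner is assumed (separate caches per reasoning mode are not shown).

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set, Tuple

Triple = Tuple[str, str, str]


@dataclass
class InferenceCache:
    """Hypothetical stand-in for the kernel's :inferred/<graph> bookkeeping."""
    compute_closure: Callable[[Set[Triple]], Set[Triple]]
    graphs: Dict[str, Set[Triple]] = field(default_factory=dict)
    inferred: Dict[str, Set[Triple]] = field(default_factory=dict)
    current: Set[str] = field(default_factory=set)   # graphs whose cache is current
    closure_runs: int = 0                            # counts full recomputations

    def write(self, graph: str, triple: Triple) -> None:
        self.graphs.setdefault(graph, set()).add(triple)
        self.current.discard(graph)                  # any write marks the cache stale

    def query(self, graph: str, reasoning: str = "none") -> Set[Triple]:
        explicit = self.graphs.get(graph, set())
        if reasoning == "none":                      # default: no reasoning cost at all
            return set(explicit)
        if graph not in self.current:                # cold or stale: recompute full closure
            self.inferred[graph] = self.compute_closure(explicit)
            self.current.add(graph)
            self.closure_runs += 1
        return explicit | self.inferred[graph]       # <graph> UNION :inferred/<graph>


def toy_closure(triples: Set[Triple]) -> Set[Triple]:
    # Stand-in reasoner: everything typed :Patient is also a schema:Person.
    return {(s, "rdf:type", "schema:Person")
            for s, p, o in triples if p == "rdf:type" and o == ":Patient"}


cache = InferenceCache(compute_closure=toy_closure)
cache.write(":clinic", (":Patient1", "rdf:type", ":Patient"))
cold = cache.query(":clinic", reasoning="rdfs")      # computes and stores the closure
warm = cache.query(":clinic", reasoning="rdfs")      # served from the cache
cache.write(":clinic", (":Patient2", "rdf:type", ":Patient"))
after_write = cache.query(":clinic", reasoning="rdfs")  # stale, so recomputed
plain = cache.query(":clinic")                       # reasoning-unaware path
```

In this run the closure is computed twice (once cold, once after the write), while the warm query and the reasoning-unaware query trigger no reasoning work.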

Consequences

Positive

Zero perf cost for reasoning-unaware capabilities. Most handlers pay nothing.
Developers must think about reasoning to use it. Opting in forces the question "do I actually need entailment here?", and the answer is almost always no.
Cache amortizes cost over read-heavy workloads. Typical semweb apps read far more than they write.
Named-graph scoping means a noisy write graph doesn't invalidate reasoning over other graphs.

Negative

Cache invalidation is genuinely hard. Writes that affect ontology-level premises (e.g., a new rdfs:subClassOf) may require broader invalidation than the naive per-graph approach provides. Mitigated by treating ontology updates as global invalidations (an explicit operation).
First query after invalidation is slow. Mitigated by optional pre-warming in trails onto evolve.
Debugging inference issues requires inspecting :inferred graph contents. Mitigated by trails trace showing which inferred triples were used.
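The invalidation problem in the first item can be illustrated with a small sketch. It assumes that closures over data graphs draw on axioms held in a separate ontology graph; the ADR does not spell out that layout. `InferredCacheState`, `on_write`, and `invalidate_all` are hypothetical names, not the framework's API.

```python
class InferredCacheState:
    """Tracks which :inferred/<graph> caches are current (hypothetical helper)."""

    def __init__(self):
        self.current = set()

    def mark_current(self, graph):
        self.current.add(graph)

    def on_write(self, graph):
        # Naive per-graph rule: only the written graph's cache goes stale.
        self.current.discard(graph)

    def invalidate_all(self):
        # Explicit global invalidation, intended for ontology updates.
        self.current.clear()


state = InferredCacheState()
state.mark_current(":clinic")
state.mark_current(":billing")

# A new rdfs:subClassOf axiom is written to a separate ontology graph.
state.on_write(":ontology")
still_current = set(state.current)   # both data-graph caches still look current

# The explicit operation is what actually forces recomputation everywhere.
state.invalidate_all()
after_global = set(state.current)
```

Under the per-graph rule alone, the caches for `:clinic` and `:billing` stay marked current even though their premises changed, which is why the ontology update has to be an explicit global invalidation.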

Non-consequences

Apps that don't opt in never see inference behavior change.
Reasoner implementation is abstracted behind a trait — swap OWL-RL for RDFS or SWRL without framework changes.
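As an illustration of that abstraction, the sketch below uses a Python `typing.Protocol` as a stand-in for the trait. The interface shape (a single `closure` method), the class names, and the `answer` helper are all assumptions made for this example; the actual trait is not shown in this ADR.

```python
from typing import Protocol, Set, Tuple

Triple = Tuple[str, str, str]


class Reasoner(Protocol):
    """Hypothetical reasoner interface: returns inferred triples only."""

    def closure(self, triples: Set[Triple]) -> Set[Triple]:
        ...


class NullReasoner:
    """Corresponds to reasoning mode "none": infers nothing."""

    def closure(self, triples: Set[Triple]) -> Set[Triple]:
        return set()


class DirectSubclassReasoner:
    """Toy reasoner: one step of rdfs9 only, not a full RDFS closure."""

    def closure(self, triples: Set[Triple]) -> Set[Triple]:
        subclass_pairs = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
        inferred = {
            (s, "rdf:type", parent)
            for s, p, o in triples if p == "rdf:type"
            for child, parent in subclass_pairs if o == child
        }
        return inferred - set(triples)


def answer(triples: Set[Triple], reasoner: Reasoner) -> Set[Triple]:
    # The caller depends only on the interface, so reasoners are interchangeable.
    return set(triples) | reasoner.closure(triples)


data = {
    (":Patient1", "rdf:type", ":Patient"),
    (":Patient", "rdfs:subClassOf", "schema:Person"),
}
without_reasoning = answer(data, NullReasoner())
with_reasoning = answer(data, DirectSubclassReasoner())
```

Swapping the reasoner changes which triples `answer` returns without any change to `answer` itself.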

Deferred

Incremental reasoning (recompute only the affected portion on write) is v2+. v1 recomputes the full closure on invalidation.
Backward-chaining (reason at query plan time without materialization) is v2+ exploration.

Revisit conditions

If real apps frequently need reasoning on hot paths, add an eager-materialize mode as an opt-in alternative.
If OWL-RL proves insufficient (e.g., users need OWL-DL expressivity), evaluate embedding a DL reasoner (HermiT, Pellet) — likely v2+.