ADR-0074: HippoRAG 2 / GraphRAG Agent Memory Backend
- Status: Accepted
- Date: 2026-05-25
- Extends: ADR-0019 (App surface — ingestion, vector, admin UI), ADR-0018 (Agent runtime)
- Tracks: trails.vector.hipporag
Context
Trails ships trails.vector (SqliteVec + Qdrant adapters, hybrid dense/sparse retrieval) and trails.agent (ReAct / Plan-and-Execute / Reflexion planners). Multi-hop question answering — "which patients share a diagnosis with a patient who also has condition Y?" — is the critical bottleneck for agentic KG applications. Flat vector similarity retrieves individually relevant passages but fails to chain them across associative hops.
Research baseline: HippoRAG (Jiménez Gutiérrez et al., NeurIPS 2024) demonstrated that layering a Personalized PageRank (PPR) hippocampal index over a knowledge graph built from corpus entity/relation extraction lets single-step retrieval match or exceed IRCoT (iterative retrieval with chain-of-thought) on multi-hop QA while being 10–30× cheaper. HippoRAG 2 (Gutiérrez et al., ICML 2025) extended the approach with deeper passage-level integration and reported a 7% improvement over a state-of-the-art embedding model on associative memory tasks.
Gap in Trails today:
- trails.vector.retrieve() supports mode="dense", mode="sparse", mode="hybrid" but has no graph-structured retrieval path.
- KG neighbourhood expansion (ctx.kg.subgraph(), shipped in M29) requires known seed IRIs; it does not propagate through synonym/coreference links built from unstructured corpus passages.
- Agents performing multi-hop QA over ingested document collections (PDFs, HTML, markdown via trails.ingest) lack a retrieval mode that mirrors the associative, graph-structured memory of the hippocampus.
Related work surveyed:
- LightRAG (Guo et al., EMNLP 2025 Findings) introduced a dual-level entity+relation graph with incremental update support that avoids full reprocessing on new document ingestion.
- KAG (Liang et al., ACM Web Conf 2025) demonstrated that knowledge graph augmentation with SHACL-like structural constraints significantly reduces hallucination in complex QA.
- Think-on-Graph 2.0 (Ma et al., ICLR 2025) showed that beam search over KG entity triples outperforms both pure retrieval and pure generation for structured multi-hop reasoning.
- GraphRAG Survey (Peng et al., ACM TOIS 2025) catalogues graph-based retrieval as the current state-of-the-art direction for agentic memory systems.
Decision
Introduce trails.vector.hipporag as a new optional retrieval backend. It co-exists with the existing SqliteVec/Qdrant backends and is activated by passing mode="hippo" to the existing retrieve() API surface.
Module: trails/vector/hipporag.py
HippoIndex
The central class manages the synonym graph, entity-to-passage index, and PPR computation.
```python
from __future__ import annotations  # allow forward references such as RetrievedPassage


class HippoIndex:
    """Hippocampal index: synonym graph + PPR retrieval over entity/relation KG."""

    def __init__(self, *, llm=None, entity_extractor=None):
        """
        Parameters
        ----------
        llm : optional LLM client (trails.llm)
            When provided, entity/relation extraction uses LLM prompting.
            When None, falls back to regex-based extraction (offline-safe).
        entity_extractor : optional callable
            Custom extraction function: (text: str) -> list[Entity]
            Overrides both LLM and regex defaults.
        """

    def add_passage(self, passage_id: str, text: str, *, metadata: dict | None = None) -> None:
        """
        Add a single passage to the index with incremental update.

        Extracts entities and relations from text, adds them to the synonym
        graph, and records the passage-to-entity mapping. Does NOT require
        full reprocessing of existing passages (LightRAG EMNLP 2025 pattern).

        Idempotent: re-adding an existing passage_id updates its metadata
        without duplicating graph nodes.
        """

    def build_from_passages(self, passages: list[dict]) -> None:
        """Batch-build from a list of {id, text, metadata} dicts."""

    def retrieve(
        self,
        query: str,
        *,
        top_k: int = 10,
        ppr_alpha: float = 0.15,
        max_iter: int = 50,
        tol: float = 1e-6,
    ) -> list[RetrievedPassage]:
        """
        PPR-based multi-hop retrieval.

        Steps:
        1. Extract query entities (same extractor as add_passage).
        2. Seed PPR random-walk from query-entity nodes.
        3. Run power-iteration until convergence (tol) or max_iter.
        4. Return top_k passages ranked by summed PPR scores of their entities.
        """

    def synonym_graph(self) -> dict:
        """Return the synonym graph as adjacency dict (for inspection/serialisation)."""

    def save(self, path: str) -> None:
        """Persist index to disk (JSON + numpy arrays)."""

    @classmethod
    def load(cls, path: str) -> "HippoIndex":
        """Load a previously saved index."""
```
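Step 4 of retrieve() ranks passages by the summed PPR scores of their entities. A minimal sketch of that ranking step follows, assuming the PPR result is available as a plain dict of entity scores and the passage-to-entity mapping as a dict of sets. The function name rank_passages and both data shapes are hypothetical, not part of the proposed API.

```python
def rank_passages(entity_scores, passage_entities, top_k=10):
    """Rank passages by the summed PPR score of the entities they mention.

    entity_scores:    dict entity -> PPR score (hypothetical shape)
    passage_entities: dict passage_id -> set of entities (hypothetical shape)
    Returns a list of (passage_id, score, entity_hits) tuples.
    """
    ranked = []
    for passage_id, entities in passage_entities.items():
        # Only entities that received PPR mass count as hits.
        hits = sorted(e for e in entities if entity_scores.get(e, 0.0) > 0.0)
        score = sum(entity_scores[e] for e in hits)
        if score > 0.0:
            ranked.append((passage_id, score, hits))
    # Highest score first; passage_id breaks ties deterministically.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Illustrative data only.
entity_scores = {"aspirin": 0.5, "headache": 0.3, "fever": 0.2}
passage_entities = {
    "p1": {"aspirin", "headache"},
    "p2": {"fever"},
    "p3": {"insulin"},  # no PPR mass reached this passage, so it is dropped
}
top = rank_passages(entity_scores, passage_entities, top_k=2)
```

Summing rather than averaging favours passages that mention several highly scored entities, which is the behaviour the docstring describes. Whether scores should also be normalised by passage length is left open here.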
RetrievedPassage
```python
from dataclasses import dataclass


@dataclass
class RetrievedPassage:
    passage_id: str
    text: str
    ppr_score: float
    entity_hits: list[str]  # entities from this passage that matched the PPR walk
    metadata: dict
```
Entity extraction: LLM-optional with regex fallback
```python
class RegexEntityExtractor:
    """
    Offline-safe extractor. Identifies capitalised noun phrases and
    known domain patterns (dates, measurement values, IRI-like strings).
    Suitable for CI, air-gapped, and low-cost deployments.
    """


class LLMEntityExtractor:
    """
    LLM-driven extractor using trails.llm. Prompt: extract (entity, relation,
    entity) triples from passage. Requires configured LLM client.
    """
```
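To make the regex fallback concrete, here is a minimal sketch of an extractor covering the four pattern families named above. The patterns, the labels, and the extract_entities function are illustrative assumptions, not the shipped RegexEntityExtractor. The capitalised-phrase rule in particular will also pick up ordinary sentence-initial words.

```python
import re

# Hypothetical patterns, ordered from most to least specific.
_PATTERNS = [
    ("iri", re.compile(r"https?://[^\s<>\"]+")),
    ("date", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("measurement", re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|kg|ml|mmHg|cm|%)(?![A-Za-z])")),
    ("name", re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")),
]


def extract_entities(text):
    """Return (label, surface_form) pairs found in text, with no LLM involved."""
    entities = []
    remaining = text
    for label, pattern in _PATTERNS:
        for match in pattern.finditer(remaining):
            # Trailing sentence punctuation is not part of the entity.
            entities.append((label, match.group().rstrip(".,;:")))
        # Blank out matches so later, broader patterns cannot re-match them.
        remaining = pattern.sub(lambda m: " " * len(m.group()), remaining)
    return entities


sample = (
    "Alice Smith was prescribed 5 mg of Warfarin on 2024-03-01; "
    "see http://example.org/patient/42."
)
found = extract_entities(sample)
```

Applying the specific patterns first and masking their matches keeps the broad capitalised-phrase rule from matching fragments inside IRIs or dates.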
Integration with existing retrieve() surface
The existing trails.vector.retrieve() function gains a mode="hippo" branch:
```python
# Existing call sites unchanged
results = await retrieve(
    query="patients who share diagnosis X and also have condition Y",
    ctx=ctx,
    mode="hippo",            # new
    hippo_index=my_index,    # injected index instance
    top_k=10,
)
```
The function signature remains backward-compatible: mode defaults to "hybrid" (existing behaviour). hippo_index= is only required when mode="hippo".
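One way the backward-compatible dispatch could look is sketched below. This is an assumption about the internal structure, not the actual trails.vector.retrieve() implementation. _existing_retrieve is a hypothetical stand-in for the current dense/sparse/hybrid path, and running the synchronous HippoIndex.retrieve() via asyncio.to_thread (Python 3.9+) is one possible choice.

```python
import asyncio

_VALID_MODES = {"dense", "sparse", "hybrid", "hippo"}


async def _existing_retrieve(query, *, ctx, mode, top_k):
    """Hypothetical stand-in for the current dense/sparse/hybrid code path."""
    return []


async def retrieve(query, *, ctx=None, mode="hybrid", hippo_index=None, top_k=10):
    """Dispatch on mode; the default stays "hybrid" so old call sites are unaffected."""
    if mode not in _VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(_VALID_MODES)}")
    if mode == "hippo":
        if hippo_index is None:
            raise ValueError('mode="hippo" requires hippo_index=')
        # HippoIndex.retrieve() is synchronous in the signature above, so run it
        # in a worker thread to avoid blocking the event loop.
        return await asyncio.to_thread(hippo_index.retrieve, query, top_k=top_k)
    return await _existing_retrieve(query, ctx=ctx, mode=mode, top_k=top_k)
```

Failing fast when hippo_index is missing keeps the new branch from silently falling back to hybrid retrieval.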
Incremental updates
Following the LightRAG EMNLP 2025 pattern, add_passage() supports incremental ingestion without rebuilding the full PPR matrix:
- Extract entities from the new passage.
- Find synonym links (edit-distance + embedding similarity when LLM available).
- Add new graph edges; patch affected PPR columns only (single-source PPR restart from new entity nodes).
- Update passage-to-entity mapping.
Full rebuild (build_from_passages()) is available for cold starts and for reindexing after bulk ingestion.
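The incremental steps can be sketched with a small in-memory graph. This is a simplified illustration under several assumptions. difflib.SequenceMatcher's similarity ratio stands in for edit distance, embedding similarity is omitted, co-occurrence within a passage is used as the relation edge, and the class name TinySynonymGraph and the 0.85 threshold are hypothetical. The returned list of new nodes is what a single-source PPR restart would be seeded from.

```python
from difflib import SequenceMatcher


class TinySynonymGraph:
    def __init__(self, threshold=0.85):
        self.threshold = threshold
        self.edges = {}             # entity -> set of neighbouring entities
        self.passage_entities = {}  # passage_id -> set of entities

    def _similar(self, a, b):
        # Similarity ratio as a stand-in for an edit-distance check.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= self.threshold

    def add_passage(self, passage_id, entities):
        """Add one passage; returns the entity nodes that are new to the graph."""
        entities = set(entities)
        new_nodes = sorted(e for e in entities if e not in self.edges)
        for entity in new_nodes:
            # Compare only the new node against existing nodes; existing
            # pairs are never re-examined, so no full rebuild is needed.
            synonyms = {other for other in self.edges if self._similar(entity, other)}
            self.edges[entity] = set()
            for other in synonyms:
                self.edges[entity].add(other)
                self.edges[other].add(entity)
        # Assumed relation edges: entities co-occurring in the same passage.
        for entity in entities:
            self.edges[entity].update(entities - {entity})
        # Re-adding a passage_id overwrites its mapping instead of duplicating it.
        self.passage_entities[passage_id] = entities
        return new_nodes
```

Because set operations are idempotent, re-adding the same passage leaves the graph unchanged and reports no new nodes.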
PPR computation
Uses power iteration over a sparse adjacency matrix (scipy.sparse, optional dep). The alpha parameter (default 0.15) is the teleportation (restart) probability: lower alpha lets the walk travel further from the query seeds, while higher alpha keeps scores concentrated near them.
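The power iteration itself is short enough to sketch. The version below is a dependency-free stand-in that uses plain dicts instead of scipy.sparse, so it illustrates the update rule rather than the proposed implementation. The function name, the unweighted undirected graph shape, and the handling of dangling nodes (returning their mass to the seeds) are assumptions.

```python
def personalized_pagerank(graph, seeds, alpha=0.15, max_iter=50, tol=1e-6):
    """Personalized PageRank by power iteration.

    graph: dict node -> iterable of neighbour nodes (every neighbour is a key).
    seeds: query-entity nodes; restart mass is spread uniformly over them.
    alpha: teleportation (restart) probability.
    """
    nodes = list(graph)
    seeds = sorted({s for s in seeds if s in graph})
    if not seeds:
        return {n: 0.0 for n in nodes}

    restart = {n: 0.0 for n in nodes}
    for s in seeds:
        restart[s] = 1.0 / len(seeds)

    scores = dict(restart)
    for _ in range(max_iter):
        # With probability alpha the walker teleports back to a seed.
        nxt = {n: alpha * restart[n] for n in nodes}
        for n in nodes:
            neighbours = list(graph[n])
            walk_mass = (1.0 - alpha) * scores[n]
            if neighbours:
                # Otherwise it moves to a uniformly chosen neighbour.
                share = walk_mass / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: return its mass to the seeds (assumption).
                for s in seeds:
                    nxt[s] += walk_mass / len(seeds)
        delta = sum(abs(nxt[n] - scores[n]) for n in nodes)
        scores = nxt
        if delta < tol:  # L1 change below tolerance: converged
            break
    return scores
```

Each step contracts the error by at most a factor of (1 - alpha), so with alpha=0.15 the loop may hit max_iter before reaching a tight tol on some graphs. Both parameters are worth tuning together.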
Non-goals
- No replacement of existing SqliteVec/Qdrant backends. mode="hippo" is additive.
- No Neo4j/Memgraph/Dgraph storage backend. The synonym graph lives in-memory or serialised to JSON; the KG itself remains in Oxigraph (ADR-0007).
- LLM extraction is optional. The regex fallback must work fully offline and in CI without any LLM API key. Tests that use LLMEntityExtractor are gated on TRAILS_TEST_LLM=1.
- No distributed PPR computation in this milestone. Single-process, in-memory. Horizontal scale deferred to a future ADR.
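The TRAILS_TEST_LLM=1 gate could be expressed with the standard library's unittest.skipUnless, as sketched below. The test class and its placeholder body are hypothetical. A pytest-based suite would more likely use a skipif marker with the same condition.

```python
import os
import unittest

# LLM-backed tests run only when the environment opts in explicitly.
LLM_TESTS_ENABLED = os.environ.get("TRAILS_TEST_LLM") == "1"


@unittest.skipUnless(LLM_TESTS_ENABLED, "set TRAILS_TEST_LLM=1 to run LLM extraction tests")
class LLMEntityExtractorTests(unittest.TestCase):
    def test_extracts_triples(self):
        # Hypothetical body: build an LLMEntityExtractor from a configured
        # trails.llm client and assert on the triples it returns.
        self.skipTest("wire up LLMEntityExtractor in the real suite")
```

Gating at class level keeps CI runs free of any LLM client construction when the variable is unset.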
Consequences
Positive
- Multi-hop QA accuracy. PPR-based retrieval follows associative chains through the synonym graph — passages linked via shared entities surface even when no passage individually contains the answer.
- Cost advantage. 10–30× cheaper than IRCoT for equivalent accuracy (as reported for single-step HippoRAG retrieval, NeurIPS 2024), since PPR replaces iterative LLM calls for retrieval decisions.
- Incremental updates. add_passage() supports live document feeds without full reindex, which makes it compatible with trails.ingest streaming ingestion.
- Offline-safe. Regex fallback means the index can be built and queried without LLM API access. Suitable for air-gapped and CI environments.
- Composable. HippoIndex is independent of the agent planner: any retrieve() call site gains multi-hop retrieval by switching to mode="hippo".
Negative
- Memory footprint. The synonym graph lives in-memory. For large corpora (100k+ passages), scipy sparse matrices are required to keep PPR tractable. Beyond roughly 500k passages (see Revisit conditions), a future ADR should introduce a persistent graph backend.
- Extraction quality ceiling. The regex fallback produces lower-quality entity graphs than LLM extraction, and multi-hop accuracy degrades accordingly. Users should be aware that offline mode trades accuracy for zero LLM cost.
- Optional dependency surface. scipy is required for efficient PPR; numpy is required for passage vectors. Both are added as trails[hipporag] extras, not hard dependencies.
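Because scipy and numpy are extras rather than hard dependencies, the hippo code path needs to import them lazily and fail with an actionable message. A minimal sketch follows, assuming a hypothetical helper named require_optional and assuming the distribution is installable under the name trails.

```python
import importlib


def require_optional(module_name, extra="hipporag"):
    """Import an optional dependency or raise with an install hint.

    The helper name and the install command are illustrative assumptions.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for mode='hippo'; "
            f"install it with: pip install 'trails[{extra}]'"
        ) from exc
```

Calling the helper inside the hippo branch, rather than at module import time, keeps minimal installs importable.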
Non-consequences
- ADR-0019 (vector retrieval surface) is unchanged; mode="hippo" is additive.
- ADR-0018 (agent runtime) is unchanged; planners consume retrieve() results without knowing which mode was used.
- ADR-0007 (Oxigraph default store) is unchanged; the synonym graph is a separate in-memory structure, not stored in Oxigraph.
Revisit conditions
- If HippoRAG 3 or a successor paper significantly changes the PPR/synonym-graph architecture, evaluate alignment.
- If corpus sizes routinely exceed 500k passages, introduce a persistent graph backend (candidate: RDF-star in Oxigraph for synonym edges).
- If scipy proves too heavy for minimal installs, implement a pure-Python sparse-matrix fallback with documented accuracy trade-offs.
References
- Jiménez Gutiérrez, B., Shu, Y., Gu, Y., Yasunaga, M., & Su, Y. (2024). HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. Advances in Neural Information Processing Systems (NeurIPS 2024). arXiv:2405.14831.
- Gutiérrez, B. J., Shu, Y., Qi, W., Zhou, S., & Su, Y. (2025). From RAG to Memory: Non-Parametric Continual Learning for Large Language Models (HippoRAG 2). International Conference on Machine Learning (ICML 2025). arXiv:2502.14802.
- Guo, Z., Xia, L., Yu, Y., Ao, T., & Huang, C. (2025). LightRAG: Simple and Fast Retrieval-Augmented Generation. EMNLP 2025 Findings. arXiv:2410.05779.
- Liang, L., Sun, M., Gui, Z., Zhu, Y., Jiang, Z., Zhong, Y., Leng, Y., Wang, B., Yang, C., Wan, L., Zhao, Z., & Li, L. (2025). KAG: Knowledge Augmented Generation. Companion Proceedings of the ACM Web Conference 2025 (WWW '25 Companion). DOI:10.1145/3701716.3715240. arXiv:2409.13731.
- Ma, S., Xu, C., Jiang, X., Li, M., Qu, H., Yang, C., Mao, J., & Guo, J. (2025). Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-Guided Retrieval Augmented Generation. International Conference on Learning Representations (ICLR 2025). arXiv:2407.10805.
- Peng, B., Zhu, Y., Liu, Y., Bo, X., Shi, H., Hong, C., Zhang, Y., & Tang, S. (2025). Graph Retrieval-Augmented Generation: A Survey. ACM Transactions on Information Systems (TOIS). DOI:10.1145/3777378.