ADR-0074: HippoRAG 2 / GraphRAG Agent Memory Backend

  • Status: Accepted
  • Date: 2026-05-25
  • Extends: ADR-0019 (App surface — ingestion, vector, admin UI), ADR-0018 (Agent runtime)
  • Tracks: trails.vector.hipporag

Context

Trails ships trails.vector (SqliteVec + Qdrant adapters, hybrid dense/sparse retrieval) and trails.agent (ReAct / Plan-and-Execute / Reflexion planners). Multi-hop question answering — "which patients share a diagnosis with a patient who also has condition Y?" — is the critical bottleneck for agentic KG applications. Flat vector similarity retrieves individually relevant passages but fails to chain them across associative hops.

Research baseline: HippoRAG (Jiménez Gutiérrez et al., NeurIPS 2024) demonstrated that layering a Personalized PageRank (PPR) hippocampal index over a knowledge graph built from corpus entity/relation extraction lets single-step retrieval match or exceed IRCoT (iterative retrieval with chain-of-thought) on multi-hop QA at 10–30× lower cost. HippoRAG 2 (Jiménez Gutiérrez et al., ICML 2025) extended the approach with improved passage-level integration and reported a 7% gain over the state-of-the-art embedding model on associative memory tasks.

Gap in Trails today:

  • trails.vector.retrieve() supports mode="dense", mode="sparse", mode="hybrid" but has no graph-structured retrieval path.
  • KG neighbourhood expansion (ctx.kg.subgraph(), shipped in M29) requires known seed IRIs; it does not propagate through synonym/coreference links built from unstructured corpus passages.
  • Agents performing multi-hop QA over ingested document collections (PDFs, HTML, markdown via trails.ingest) lack a retrieval mode that mirrors the associative, graph-structured memory of the hippocampus.

Related work surveyed:

  • LightRAG (Guo et al., EMNLP 2025 Findings) introduced a dual-level entity+relation graph with incremental update support that avoids full reprocessing on new document ingestion.
  • KAG (Liang et al., ACM Web Conf 2025) demonstrated that knowledge graph augmentation with SHACL-like structural constraints significantly reduces hallucination in complex QA.
  • Think-on-Graph 2.0 (Ma et al., ICLR 2025) showed that beam-search over KG entity triples outperforms both pure retrieval and pure generation for structured multi-hop reasoning.
  • GraphRAG Survey (Peng et al., ACM TOIS 2025) catalogues graph-based retrieval as the current state-of-the-art direction for agentic memory systems.

Decision

Introduce trails.vector.hipporag as a new optional retrieval backend. It co-exists with the existing SqliteVec/Qdrant backends and is activated by passing mode="hippo" to the existing retrieve() API surface.

Module: trails/vector/hipporag.py

HippoIndex

The central class manages the synonym graph, entity-to-passage index, and PPR computation.

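from __future__ import annotations  # lets annotations reference RetrievedPassage before it is defined

class HippoIndex: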
    """Hippocampal index: synonym graph + PPR retrieval over entity/relation KG."""

    def __init__(self, *, llm=None, entity_extractor=None):
        """
        Parameters
        ----------
        llm : optional LLM client (trails.llm)
            When provided, entity/relation extraction uses LLM prompting.
            When None, falls back to regex-based extraction (offline-safe).
        entity_extractor : optional callable
            Custom extraction function: (text: str) -> list[Entity]
            Overrides both LLM and regex defaults.
        """

    def add_passage(self, passage_id: str, text: str, *, metadata: dict | None = None) -> None:
        """
        Add a single passage to the index with incremental update.

        Extracts entities and relations from text, adds them to the synonym
        graph, and records the passage-to-entity mapping. Does NOT require
        full reprocessing of existing passages (LightRAG EMNLP 2025 pattern).

        Idempotent: re-adding an existing passage_id updates its metadata
        without duplicating graph nodes.
        """

    def build_from_passages(self, passages: list[dict]) -> None:
        """Batch-build from a list of {id, text, metadata} dicts."""

    def retrieve(
        self,
        query: str,
        *,
        top_k: int = 10,
        ppr_alpha: float = 0.15,
        max_iter: int = 50,
        tol: float = 1e-6,
    ) -> list[RetrievedPassage]:
        """
        PPR-based multi-hop retrieval.

        Steps:
        1. Extract query entities (same extractor as add_passage).
        2. Seed PPR random-walk from query-entity nodes.
        3. Run power-iteration until convergence (tol) or max_iter.
        4. Return top_k passages ranked by summed PPR scores of their entities.
        """

    def synonym_graph(self) -> dict:
        """Return the synonym graph as adjacency dict (for inspection/serialisation)."""

    def save(self, path: str) -> None:
        """Persist index to disk (JSON + numpy arrays)."""

    @classmethod
    def load(cls, path: str) -> "HippoIndex":
        """Load a previously saved index."""
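Step 4 of retrieve() ranks passages by the summed PPR scores of their entities. The following is a minimal sketch of that aggregation, assuming the PPR scores arrive as a plain entity-to-score dict and the index keeps a passage-to-entity mapping. rank_passages and its tuple return shape are hypothetical names used for illustration, not the module's API.

```python
def rank_passages(
    entity_scores: dict[str, float],
    passage_entities: dict[str, set[str]],
    top_k: int = 10,
) -> list[tuple[str, float, list[str]]]:
    """Rank passages by the sum of PPR scores of the entities they mention.

    Hypothetical helper: returns (passage_id, ppr_score, entity_hits) tuples.
    """
    ranked = []
    for passage_id, entities in passage_entities.items():
        # Only entities the walk actually reached contribute to the score.
        hits = sorted(e for e in entities if entity_scores.get(e, 0.0) > 0.0)
        if hits:
            ranked.append((passage_id, sum(entity_scores[e] for e in hits), hits))
    # Highest score first; passage_id breaks ties deterministically.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked[:top_k]


# Illustrative data only.
scores = {"aspirin": 0.5, "stroke": 0.25, "warfarin": 0.125}
passages = {"p1": {"aspirin", "stroke"}, "p2": {"warfarin"}, "p3": {"insulin"}}
top = rank_passages(scores, passages, top_k=2)
```

Summing favours passages that mention many reached entities. Whether to normalise by entity count is a design choice this sketch leaves open.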

RetrievedPassage

from dataclasses import dataclass

@dataclass
class RetrievedPassage:
    passage_id: str
    text: str
    ppr_score: float
    entity_hits: list[str]   # entities from this passage that matched the PPR walk
    metadata: dict

Entity extraction — LLM-optional with regex fallback

class RegexEntityExtractor:
    """
    Offline-safe extractor. Identifies capitalised noun phrases and
    known domain patterns (dates, measurement values, IRI-like strings).
    Suitable for CI, air-gapped, and low-cost deployments.
    """

class LLMEntityExtractor:
    """
    LLM-driven extractor using trails.llm. Prompt: extract (entity, relation,
    entity) triples from passage. Requires configured LLM client.
    """
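To make the offline path concrete, here is a rough sketch of what a regex-based extractor along these lines might look like. The ADR names the pattern categories but not the expressions, so every pattern below, the unit list, and the extract_entities function name are assumptions for illustration.

```python
import re

# Hypothetical patterns, applied in order from most to least specific.
_IRI = re.compile(r"\b(?:https?|urn):[^\s<>]+")
_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
_MEASURE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|g|kg|ml|l|mmHg|cm|mm|%)(?![A-Za-z])")
# Capitalised runs; this also catches sentence-initial words, a known weakness.
_CAP_PHRASE = re.compile(r"\b[A-Z][a-zA-Z0-9]+(?:\s+[A-Z][a-zA-Z0-9]+)*\b")


def extract_entities(text: str) -> list[str]:
    """Return de-duplicated entity strings, grouped by pattern order."""
    found: list[str] = []
    remaining = text
    for pattern in (_IRI, _DATE, _MEASURE, _CAP_PHRASE):
        for match in pattern.finditer(remaining):
            value = match.group(0).rstrip(".,;")  # drop trailing punctuation
            if value and value not in found:
                found.append(value)
        # Blank out matched spans so later patterns cannot re-match inside them.
        remaining = pattern.sub(" ", remaining)
    return found


sample = (
    "Metformin 500 mg was prescribed to Jane Doe on 2024-03-01, "
    "see https://example.org/rx/42."
)
entities = extract_entities(sample)
```

A function with this shape could be passed as entity_extractor= if it returned Entity objects instead of strings. The Entity type is not defined in this ADR, so the sketch stays with plain strings.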

Integration with existing retrieve() surface

The existing trails.vector.retrieve() function gains a mode="hippo" branch:

# Existing call sites unchanged
results = await retrieve(
    query="patients who share diagnosis X and also have condition Y",
    ctx=ctx,
    mode="hippo",          # new
    hippo_index=my_index,  # injected index instance
    top_k=10,
)

The function signature remains backward-compatible: mode defaults to "hybrid" (existing behaviour). hippo_index= is only required when mode="hippo".
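The backward-compatibility rule can be sketched as a small dispatch wrapper. This is an assumption about how the branch could be wired, not the shipped implementation: _retrieve_existing is a hypothetical placeholder for the current dense/sparse/hybrid path, and raising ValueError for a missing index is one possible choice.

```python
import asyncio

_EXISTING_MODES = {"dense", "sparse", "hybrid"}


async def _retrieve_existing(query, ctx, mode, top_k):
    """Hypothetical placeholder for the current dense/sparse/hybrid code path."""
    return []


async def retrieve(query, *, ctx=None, mode="hybrid", hippo_index=None, top_k=10):
    """Dispatch on mode; existing call sites keep the "hybrid" default."""
    if mode == "hippo":
        if hippo_index is None:
            raise ValueError('mode="hippo" requires hippo_index=')
        # HippoIndex.retrieve() is synchronous in the interface above.
        return hippo_index.retrieve(query, top_k=top_k)
    if mode not in _EXISTING_MODES:
        raise ValueError(f"unknown retrieval mode: {mode!r}")
    return await _retrieve_existing(query, ctx, mode, top_k)


if __name__ == "__main__":
    # A call without mode= behaves exactly as before.
    print(asyncio.run(retrieve("example query")))
```

Failing fast when hippo_index= is missing keeps a misconfigured call from silently falling back to flat retrieval.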

Incremental updates

Following the LightRAG EMNLP 2025 pattern, add_passage() supports incremental ingestion without rebuilding the full PPR matrix:

  1. Extract entities from the new passage.
  2. Find synonym links (edit-distance + embedding similarity when LLM available).
  3. Add new graph edges; patch affected PPR columns only (single-source PPR restart from new entity nodes).
  4. Update passage-to-entity mapping.

Full rebuild (build_from_passages()) is available for cold starts and for reindexing after bulk ingestion.
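The incremental steps can be illustrated with a toy index. This sketch makes several simplifying assumptions: entities are passed in already extracted, difflib's similarity ratio stands in for edit distance, embedding similarity is omitted, and PPR patching is reduced to returning the newly added nodes from which a real implementation would restart. TinySynonymIndex and its 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher


class TinySynonymIndex:
    """Toy stand-in for HippoIndex; illustrates incremental ingestion only."""

    def __init__(self, threshold: float = 0.85) -> None:
        self.threshold = threshold
        self.edges: dict[str, set[str]] = {}             # entity -> synonym neighbours
        self.passage_entities: dict[str, set[str]] = {}  # passage_id -> entities
        self.metadata: dict[str, dict] = {}

    def _is_synonym(self, a: str, b: str) -> bool:
        # Case-insensitive string similarity as a cheap edit-distance proxy.
        return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= self.threshold

    def add_passage(self, passage_id, entities, metadata=None) -> set[str]:
        """Add one passage; return the entities that were new to the graph."""
        self.metadata[passage_id] = dict(metadata or {})  # re-adding refreshes metadata
        added: set[str] = set()
        for entity in entities:
            if entity in self.edges:
                continue  # known node: nothing to link, no duplicate created
            existing = list(self.edges)
            self.edges[entity] = set()
            added.add(entity)
            for other in existing:
                if self._is_synonym(entity, other):
                    self.edges[entity].add(other)
                    self.edges[other].add(entity)
        self.passage_entities.setdefault(passage_id, set()).update(entities)
        return added


index = TinySynonymIndex()
first = index.add_passage("p1", ["Metformin", "Insulin"])
second = index.add_passage("p2", ["metformin"])
```

Only the returned nodes and their new neighbours are touched, which is what makes a localised PPR update plausible. Existing passages are never re-read.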

PPR computation

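Uses power iteration over a sparse adjacency matrix (scipy.sparse, optional dep). The ppr_alpha parameter (default 0.15) is the teleportation (restart) probability: a lower alpha lets the walk traverse further from the seeds, while a higher alpha keeps scores concentrated near the query seeds. Note that this is the complement of the damping factor used by some PageRank libraries (for example, NetworkX's alpha=0.85).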
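For intuition, the power iteration can be sketched in pure Python over an adjacency dict. This is an illustrative stand-in rather than the scipy.sparse implementation the ADR calls for. The personalized_pagerank name, the uniform seed weighting, and the choice to send mass from dangling nodes back to the seeds are all assumptions.

```python
def personalized_pagerank(graph, seeds, alpha=0.15, max_iter=50, tol=1e-6):
    """Personalized PageRank where alpha is the restart probability.

    graph maps each node to an iterable of neighbour nodes.
    """
    nodes = set(graph)
    for neighbours in graph.values():
        nodes.update(neighbours)
    seed_nodes = [s for s in dict.fromkeys(seeds) if s in nodes]
    if not seed_nodes:
        return {node: 0.0 for node in nodes}

    # Restart distribution: uniform over the query-entity seeds.
    restart = {node: 0.0 for node in nodes}
    for seed in seed_nodes:
        restart[seed] = 1.0 / len(seed_nodes)

    scores = dict(restart)
    for _ in range(max_iter):
        updated = {node: alpha * restart[node] for node in nodes}
        for node in nodes:
            neighbours = list(graph.get(node, ()))
            mass = (1.0 - alpha) * scores[node]
            if neighbours:
                share = mass / len(neighbours)
                for neighbour in neighbours:
                    updated[neighbour] += share
            else:
                # Dangling node: return its mass to the seeds.
                for seed in seed_nodes:
                    updated[seed] += mass * restart[seed]
        delta = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
ppr = personalized_pagerank(chain, ["A"])
```

The error is only guaranteed to shrink by a factor of (1 - alpha) per step. With a restart probability of 0.15, reaching a tolerance of 1e-6 can therefore take more than 50 iterations on slowly mixing graphs, in which case the loop stops at max_iter.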

Non-goals

  • No replacement of existing SqliteVec/Qdrant backends. mode="hippo" is additive.
  • No Neo4j/Memgraph/Dgraph storage backend. The synonym graph lives in-memory or serialised to JSON; the KG itself remains in Oxigraph (ADR-0007).
  • LLM extraction is optional. The regex fallback must work fully offline and in CI without any LLM API key. Tests that use LLMEntityExtractor are gated on TRAILS_TEST_LLM=1.
  • No distributed PPR computation in this milestone. Single-process, in-memory. Horizontal scale deferred to a future ADR.
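The TRAILS_TEST_LLM=1 gate could be expressed as follows. This sketch assumes a unittest-style suite. The project's actual test runner is not stated in this ADR, and llm_tests_enabled plus the test class name are hypothetical.

```python
import os
import unittest


def llm_tests_enabled(environ=None) -> bool:
    """True only when TRAILS_TEST_LLM is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get("TRAILS_TEST_LLM") == "1"


class LLMEntityExtractorTests(unittest.TestCase):
    @unittest.skipUnless(llm_tests_enabled(), "set TRAILS_TEST_LLM=1 to run LLM-backed tests")
    def test_extracts_triples(self):
        # A real test would call LLMEntityExtractor here; omitted in this sketch.
        pass
```

Tests for the regex path carry no such decorator, so they run in CI without an API key.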

Consequences

Positive

  • Multi-hop QA accuracy. PPR-based retrieval follows associative chains through the synonym graph — passages linked via shared entities surface even when no passage individually contains the answer.
  • Cost advantage. 10–30× cheaper than IRCoT for comparable accuracy (HippoRAG, NeurIPS 2024), since PPR replaces iterative LLM calls for retrieval decisions.
  • Incremental updates. add_passage() supports live document feeds without full reindex — compatible with trails.ingest streaming ingestion.
  • Offline-safe. Regex fallback means the index can be built and queried without LLM API access. Suitable for air-gapped and CI environments.
  • Composable. HippoIndex is independent of the agent planner — any retrieve() call site gains multi-hop retrieval by switching mode="hippo".

Negative

  • Memory footprint. The synonym graph lives in-memory. Above roughly 100k passages, scipy sparse matrices are required to keep PPR tractable. Beyond roughly 500k passages (see Revisit conditions), a future ADR should introduce a persistent graph backend.
  • Extraction quality ceiling. The regex fallback produces lower-quality entity graphs than LLM extraction, and multi-hop accuracy degrades with extraction quality. Users should be aware that offline mode trades accuracy for zero LLM cost.
  • Optional dependency surface. scipy is required for efficient PPR; numpy is required for passage vectors. Both are added as trails[hipporag] extras, not hard dependencies.
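One way to keep scipy and numpy out of the hard dependencies is a lazy import guard that points users at the extra. This is a sketch under assumptions: require_optional is a hypothetical helper, and the pip command assumes the distribution is published under the name trails.

```python
import importlib


def require_optional(module_name: str, extra: str = "hipporag"):
    """Import an optional dependency or raise a hint naming the extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature; install it with "
            f"`pip install 'trails[{extra}]'`"
        ) from exc
```

Calling the guard at first use of mode="hippo", for example require_optional("scipy.sparse"), rather than at module import, would leave minimal installs able to import trails.vector.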

Non-consequences

  • ADR-0019 (vector retrieval surface) is unchanged; mode="hippo" is additive.
  • ADR-0018 (agent runtime) is unchanged; planners consume retrieve() results without knowing which mode was used.
  • ADR-0007 (Oxigraph default store) is unchanged; the synonym graph is a separate in-memory structure, not stored in Oxigraph.

Revisit conditions

  • If HippoRAG 3 or a successor paper significantly changes the PPR/synonym-graph architecture, evaluate alignment.
  • If corpus sizes routinely exceed 500k passages, introduce a persistent graph backend (candidate: RDF-star in Oxigraph for synonym edges).
  • If scipy proves too heavy for minimal installs, implement a pure-Python sparse-matrix fallback with documented accuracy trade-offs.

References

  1. Jiménez Gutiérrez, B., Shu, Y., Gu, Y., Yasunaga, M., & Su, Y. (2024). HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models. Advances in Neural Information Processing Systems (NeurIPS 2024). arXiv:2405.14831.

  2. Jiménez Gutiérrez, B., Shu, Y., Qi, W., Zhou, S., & Su, Y. (2025). From RAG to Memory: Non-Parametric Continual Learning for Large Language Models (HippoRAG 2). International Conference on Machine Learning (ICML 2025). arXiv:2502.14802.

  3. Guo, Z., Xia, L., Yu, Y., Ao, T., & Huang, C. (2025). LightRAG: Simple and Fast Retrieval-Augmented Generation. EMNLP 2025 Findings. arXiv:2410.05779.

  4. Liang, L., Sun, M., Gui, Z., Zhu, Y., Jiang, Z., Zhong, Y., Leng, Y., Wang, B., Yang, C., Wan, L., Zhao, Z., & Li, L. (2025). KAG: Knowledge Augmented Generation. Companion Proceedings of the ACM Web Conference 2025 (WWW '25 Companion). DOI:10.1145/3701716.3715240. arXiv:2409.13731.

  5. Ma, S., Xu, C., Jiang, X., Li, M., Qu, H., Yang, C., Mao, J., & Guo, J. (2025). Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-Guided Retrieval Augmented Generation. International Conference on Learning Representations (ICLR 2025). arXiv:2407.10805.

  6. Peng, B., Zhu, Y., Liu, Y., Bo, X., Shi, H., Hong, C., Zhang, Y., & Tang, S. (2025). Graph Retrieval-Augmented Generation: A Survey. ACM Transactions on Information Systems (TOIS). DOI:10.1145/3777378.