【KDD 2026】MemGraphRAG: Memory-based Multi-Agent System for Graph Retrieval-Augmented Generation

A three-layer memory structure for knowledge graph retrieval and generation, designed to organize extracted information into a hierarchical memory system with inter-layer connections.

Overview

MemGraphRAG implements a three-layer memory architecture (schema, fact, and passage layers) that bridges unstructured text passages with structured knowledge graphs.

Architecture

1. Entity Extraction (`entity_type_extract.py`)

Extracts named entities from text using spaCy's transformer-based model:

- Splits text into chunks of a fixed token length (default: 512 tokens)
- Uses the `en_core_web_trf` model for entity recognition
- Outputs each entity with its text, label, and start/end character positions
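
The chunk-then-tag flow above can be sketched without spaCy installed. This is a minimal illustration, not the code in `entity_type_extract.py`: whitespace tokens stand in for spaCy tokens, and `chunk_text`, `extract_entities`, and the toy `capitalized_word_tagger` are hypothetical names. In a real run the tagger would presumably be the `en_core_web_trf` pipeline, whose `doc.ents` spans expose `text`, `label_`, `start_char`, and `end_char`.

```python
import re
from typing import Callable, Dict, List, Tuple

# A tagger maps chunk text to (entity_text, label, start, end) tuples with
# offsets relative to the chunk. It stands in for a spaCy pipeline (assumption).
Tagger = Callable[[str], List[Tuple[str, str, int, int]]]


def chunk_text(text: str, chunk_size: int = 512) -> List[Tuple[int, str]]:
    """Split text into chunks of at most `chunk_size` whitespace tokens.

    Returns (char_offset, chunk_text) pairs so that entity offsets can be
    mapped back to the original document.
    """
    tokens = list(re.finditer(r"\S+", text))
    chunks = []
    for i in range(0, len(tokens), chunk_size):
        window = tokens[i:i + chunk_size]
        start, end = window[0].start(), window[-1].end()
        chunks.append((start, text[start:end]))
    return chunks


def extract_entities(text: str, tagger: Tagger, chunk_size: int = 512) -> List[Dict]:
    """Run the tagger on each chunk and report document-level offsets."""
    results = []
    for chunk_id, (offset, chunk) in enumerate(chunk_text(text, chunk_size)):
        entities = [
            {"text": t, "label": label, "start": offset + s, "end": offset + e}
            for t, label, s, e in tagger(chunk)
        ]
        results.append({"chunk_id": chunk_id, "text": chunk, "entities": entities})
    return results


def capitalized_word_tagger(chunk: str) -> List[Tuple[str, str, int, int]]:
    """Toy tagger: every capitalized word is an entity labelled MISC."""
    return [(m.group(), "MISC", m.start(), m.end())
            for m in re.finditer(r"\b[A-Z][a-z]+\b", chunk)]


doc = "Marie Curie worked in Paris"
chunks = extract_entities(doc, capitalized_word_tagger, chunk_size=3)
```

Carrying the chunk's character offset along keeps every entity's `start`/`end` valid against the original document, so `doc[start:end]` always returns the entity text.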

2. Relation Extraction (`schema_fact_extract.py`)

Extracts relations between entities using an LLM:

- Takes entity-extracted chunks as input
- Uses the LLM to identify meaningful relations between entities
- Outputs relation triples `(head, relation, tail)` together with their type triples `(head_type, relation, tail_type)`
- Supports parallel batch processing
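
As a rough sketch of how a triple and its type triple relate, the example below parses a JSON response into both forms and fans chunks out over a thread pool. It is not taken from `schema_fact_extract.py`: `parse_triples`, `extract_batch`, the prompt text, the JSON response format, and the canned `fake_llm` are all assumptions made for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def parse_triples(raw: str, entity_types: Dict[str, str]) -> List[Dict]:
    """Turn a JSON list of [head, relation, tail] into typed triple records.

    Triples whose head or tail is not a known entity are dropped, since no
    type triple can be formed for them.
    """
    records = []
    for head, relation, tail in json.loads(raw):
        if head in entity_types and tail in entity_types:
            records.append({
                "triple": (head, relation, tail),
                "types": (entity_types[head], relation, entity_types[tail]),
            })
    return records


def extract_batch(chunks: List[Dict], llm: Callable[[str], str],
                  max_workers: int = 4) -> List[List[Dict]]:
    """Query the LLM for every chunk in parallel; results keep input order."""
    def one(chunk: Dict) -> List[Dict]:
        types = {e["text"]: e["label"] for e in chunk["entities"]}
        prompt = f"Entities: {sorted(types)}\nText: {chunk['text']}"
        return parse_triples(llm(prompt), types)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(one, chunks))


def fake_llm(prompt: str) -> str:
    """Canned response standing in for a real model call."""
    return json.dumps([["Marie Curie", "born_in", "Warsaw"],
                       ["Marie Curie", "won", "a prize"]])


chunk = {
    "text": "Marie Curie was born in Warsaw.",
    "entities": [{"text": "Marie Curie", "label": "PERSON"},
                 {"text": "Warsaw", "label": "GPE"}],
}
results = extract_batch([chunk, chunk], fake_llm)
```

The second canned triple is discarded because `"a prize"` is not among the chunk's entities, which is one simple way to keep the fact layer and the schema layer aligned.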

3. LLM Client (`llm_client.py`)

Reusable LLM client with advanced features:

- OpenAI-compatible API support
- SQLite-based response caching
- Parallel batch request processing
- Automatic JSON response parsing
- Configurable retry mechanism
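
To illustrate how caching, retries, and JSON parsing can fit together, here is a minimal standard-library sketch. It does not reproduce `llm_client.py`: the `CachedLLM` class, its parameters, the cache key scheme, and the `flaky_backend` stub are hypothetical, and the backend is a plain callable rather than a real OpenAI-compatible client.

```python
import hashlib
import json
import sqlite3
import time
from typing import Callable, Optional


class CachedLLM:
    """Wraps a prompt -> text callable with an SQLite cache and retries."""

    def __init__(self, backend: Callable[[str], str], db_path: str = ":memory:",
                 max_retries: int = 3, backoff_seconds: float = 0.0):
        self.backend = backend
        self.max_retries = max_retries
        self.backoff_seconds = backoff_seconds
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT)")

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, prompt: str, model: str = "gpt-4o-mini") -> str:
        key = self._key(model, prompt)
        row = self.db.execute(
            "SELECT response FROM cache WHERE key = ?", (key,)).fetchone()
        if row is not None:
            return row[0]  # cache hit: the backend is not called
        last_error: Optional[Exception] = None
        for attempt in range(self.max_retries):
            try:
                response = self.backend(prompt)
                break
            except Exception as error:  # retry on any backend failure
                last_error = error
                time.sleep(self.backoff_seconds * (2 ** attempt))
        else:
            raise RuntimeError("LLM call failed after retries") from last_error
        self.db.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, response))
        self.db.commit()
        return response

    def complete_json(self, prompt: str, model: str = "gpt-4o-mini"):
        """Parse the outermost JSON object or array found in the response."""
        text = self.complete(prompt, model)
        starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
        start = min(starts, default=0)
        end = max(text.rfind("}"), text.rfind("]")) + 1
        return json.loads(text[start:end])


calls = {"count": 0}


def flaky_backend(prompt: str) -> str:
    """Fails once, then answers; stands in for a network call."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("transient failure")
    return 'Here you go: {"answer": 42}'


client = CachedLLM(flaky_backend)
first = client.complete_json("What is 6 * 7?")
second = client.complete_json("What is 6 * 7?")
```

The first request fails once and is retried; the second is served from the cache, so the backend is called only twice in total. Keying the cache on both model and prompt avoids returning one model's answer for another.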

4. Prompt System (`prompt.py`, `prompt_builder.py`)

Contains prompts for various tasks:

- Relation Extraction: Extracts structured triples from entities
- Conflict Detection: Identifies three conflict types:
  - Mutual conflict (one-to-one relations)
  - Temporal conflict (time-dependent facts)
  - Granularity conflict (different specificity levels)
- Conflict Resolution: Resolves conflicts using source passages
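
The three conflict types lend themselves to a small enumeration plus a prompt template. The sketch below is only an illustration of that idea: the wording of the prompt, the extra `NONE` label, and the names `ConflictType`, `build_conflict_prompt`, and `parse_conflict_label` are assumptions, not the contents of `prompt.py` or `prompt_builder.py`.

```python
from enum import Enum
from typing import Tuple

Triple = Tuple[str, str, str]


class ConflictType(Enum):
    MUTUAL = "mutual"            # one-to-one relation with two different values
    TEMPORAL = "temporal"        # time-dependent facts that disagree
    GRANULARITY = "granularity"  # same fact at different levels of specificity
    NONE = "none"                # assumed extra label for "no conflict"


def build_conflict_prompt(new: Triple, existing: Triple) -> str:
    """Assemble a classification prompt for a pair of triples."""
    options = ", ".join(t.value for t in ConflictType)
    return (
        "Decide whether two facts conflict.\n"
        f"Fact A: {new}\n"
        f"Fact B: {existing}\n"
        f"Answer with exactly one of: {options}."
    )


def parse_conflict_label(answer: str) -> ConflictType:
    """Map a free-text model answer onto a ConflictType, defaulting to NONE."""
    cleaned = answer.strip().lower().rstrip(".")
    for conflict_type in ConflictType:
        if cleaned == conflict_type.value:
            return conflict_type
    return ConflictType.NONE


prompt = build_conflict_prompt(
    ("Ada Lovelace", "born_in", "London"),
    ("Ada Lovelace", "born_in", "England"),
)
label = parse_conflict_label(" Granularity.\n")
```

Constraining the model to a closed set of labels makes its answer easy to validate, and falling back to `NONE` on anything unexpected keeps a malformed reply from triggering a resolution step.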

5. Ontology Filtering (`ontology_filtering.py`)

Filters and enriches extracted knowledge:

- Removes low-frequency ontologies
- Adds metadata fields:
  - unique_ontologies: Unique schema patterns per chunk
  - entity_mapping: Type-entity correspondences
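
One plausible reading of this step is a corpus-wide frequency count over type triples followed by per-chunk annotation. The sketch below follows that reading; the record layout, the `min_count` threshold, and the function name `filter_ontologies` are assumptions, and the actual criteria in `ontology_filtering.py` may differ.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def filter_ontologies(chunks: List[Dict], min_count: int = 2) -> List[Dict]:
    """Drop triples whose type pattern is rare across the corpus, then
    annotate each chunk with `unique_ontologies` and `entity_mapping`."""
    counts = Counter(tuple(r["types"]) for c in chunks for r in c["relations"])
    filtered = []
    for chunk in chunks:
        kept = [r for r in chunk["relations"]
                if counts[tuple(r["types"])] >= min_count]
        mapping = defaultdict(set)
        for r in kept:
            (head, _, tail), (head_type, _, tail_type) = r["triple"], r["types"]
            mapping[head_type].add(head)
            mapping[tail_type].add(tail)
        filtered.append({
            **chunk,
            "relations": kept,
            "unique_ontologies": sorted({tuple(r["types"]) for r in kept}),
            "entity_mapping": {t: sorted(names) for t, names in mapping.items()},
        })
    return filtered


chunks = [
    {"relations": [
        {"triple": ("Marie Curie", "born_in", "Warsaw"),
         "types": ("PERSON", "born_in", "GPE")},
        {"triple": ("Marie Curie", "likes", "Radium"),
         "types": ("PERSON", "likes", "PRODUCT")},
    ]},
    {"relations": [
        {"triple": ("Alan Turing", "born_in", "London"),
         "types": ("PERSON", "born_in", "GPE")},
    ]},
]
filtered = filter_ontologies(chunks, min_count=2)
```

Here `(PERSON, likes, PRODUCT)` occurs only once and is removed, while `(PERSON, born_in, GPE)` occurs twice and survives in both chunks.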

6. Three-Layer Memory (`memory.py`)

Core memory structure implementation:

- Schema Layer: Stores ontology patterns (type triples)
- Fact Layer: Stores extracted triples
- Passage Layer: Stores original text chunks
- Inter-layer index relationships for bidirectional navigation
- Serialization support (JSON save/load)
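
The layering and its bidirectional indices can be pictured with a much smaller structure than the real one. The class below is a hypothetical sketch, not the `ThreeLayerMemory` class from `memory.py`: the ID scheme (`s0`, `f0`, ...), the method names, and the JSON layout are all assumptions.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


class TinyThreeLayerMemory:
    """Schema, fact, and passage layers linked by ID-based indices."""

    def __init__(self) -> None:
        self.schemas: Dict[str, Triple] = {}   # schema_id -> type triple
        self.facts: Dict[str, Triple] = {}     # fact_id -> (head, relation, tail)
        self.passages: Dict[str, str] = {}     # passage_id -> chunk text
        self.fact_to_schema: Dict[str, str] = {}
        self.fact_to_passages = defaultdict(list)
        self.schema_to_facts = defaultdict(list)
        self.passage_to_facts = defaultdict(list)

    @staticmethod
    def _intern(layer: Dict[str, Triple], prefix: str, value: Triple) -> str:
        """Return the ID of `value` in `layer`, adding it if missing."""
        for key, existing in layer.items():
            if existing == value:
                return key
        key = f"{prefix}{len(layer)}"
        layer[key] = value
        return key

    def add(self, passage_id: str, text: str, fact: Triple, schema: Triple) -> str:
        """Store one fact with its schema and source passage; link all three."""
        self.passages[passage_id] = text
        schema_id = self._intern(self.schemas, "s", schema)
        fact_id = self._intern(self.facts, "f", fact)
        self.fact_to_schema[fact_id] = schema_id
        if fact_id not in self.schema_to_facts[schema_id]:
            self.schema_to_facts[schema_id].append(fact_id)
        if passage_id not in self.fact_to_passages[fact_id]:
            self.fact_to_passages[fact_id].append(passage_id)
            self.passage_to_facts[passage_id].append(fact_id)
        return fact_id

    def passages_for_schema(self, schema: Triple) -> List[str]:
        """Walk schema -> facts -> passages."""
        found: List[str] = []
        for schema_id, value in self.schemas.items():
            if value != schema:
                continue
            for fact_id in self.schema_to_facts[schema_id]:
                for pid in self.fact_to_passages[fact_id]:
                    if pid not in found:
                        found.append(pid)
        return found

    def to_json(self) -> str:
        """Serialize the layers and the forward indices."""
        return json.dumps({
            "schemas": self.schemas, "facts": self.facts,
            "passages": self.passages,
            "fact_to_schema": self.fact_to_schema,
            "fact_to_passages": self.fact_to_passages,
        })

    @classmethod
    def from_json(cls, raw: str) -> "TinyThreeLayerMemory":
        """Restore the layers and rebuild the reverse indices."""
        data = json.loads(raw)
        memory = cls()
        memory.schemas = {k: tuple(v) for k, v in data["schemas"].items()}
        memory.facts = {k: tuple(v) for k, v in data["facts"].items()}
        memory.passages = data["passages"]
        memory.fact_to_schema = data["fact_to_schema"]
        for fact_id, pids in data["fact_to_passages"].items():
            memory.schema_to_facts[memory.fact_to_schema[fact_id]].append(fact_id)
            for pid in pids:
                memory.fact_to_passages[fact_id].append(pid)
                memory.passage_to_facts[pid].append(fact_id)
        return memory


memory = TinyThreeLayerMemory()
memory.add("p0", "Marie Curie was born in Warsaw.",
           ("Marie Curie", "born_in", "Warsaw"), ("PERSON", "born_in", "GPE"))
memory.add("p1", "Alan Turing was born in London.",
           ("Alan Turing", "born_in", "London"), ("PERSON", "born_in", "GPE"))
restored = TinyThreeLayerMemory.from_json(memory.to_json())
```

Only the forward indices are written out; the reverse ones are derived on load, which keeps the saved file smaller and rules out the two directions disagreeing.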

7. Conflict Resolution (`resolve_conflict.py`)

Detects and resolves triple conflicts:

- Finds related triples by shared entities
- Uses embedding similarity to find semantically similar facts
- LLM-based conflict classification
- Resolution strategies: keep, discard, or modify
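
The candidate search and the three resolution strategies can be sketched in plain Python. This is an illustration under stated assumptions rather than the logic of `resolve_conflict.py`: `find_candidates`, `apply_resolution`, the 0.8 threshold, and the one-hot `toy_embed` function are hypothetical, and the LLM classification step between the two is omitted.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Triple = Tuple[str, str, str]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_candidates(new: Triple, store: Dict[str, Triple],
                    embed: Callable[[Triple], Sequence[float]],
                    threshold: float = 0.8) -> List[str]:
    """IDs of stored triples that share an entity with `new` or whose
    embedding is at least `threshold` cosine-similar to it."""
    new_entities = {new[0], new[2]}
    new_vector = embed(new)
    hits = []
    for fact_id, triple in store.items():
        shares_entity = bool(new_entities & {triple[0], triple[2]})
        similar = cosine(new_vector, embed(triple)) >= threshold
        if shares_entity or similar:
            hits.append(fact_id)
    return hits


def apply_resolution(store: Dict[str, Triple], fact_id: str, action: str,
                     replacement: Optional[Triple] = None) -> None:
    """Apply keep, discard, or modify to the stored triple `fact_id`."""
    if action == "keep":
        return
    if action == "discard":
        store.pop(fact_id, None)
    elif action == "modify":
        if replacement is None:
            raise ValueError("modify requires a replacement triple")
        store[fact_id] = replacement
    else:
        raise ValueError(f"unknown action: {action}")


RELATIONS = ["born_in", "capital_of"]


def toy_embed(triple: Triple) -> List[float]:
    """One-hot vector over the relation name; stands in for a real embedder."""
    return [1.0 if triple[1] == name else 0.0 for name in RELATIONS]


store = {
    "f0": ("Marie Curie", "born_in", "Warsaw"),
    "f1": ("Paris", "capital_of", "France"),
    "f2": ("Alan Turing", "born_in", "London"),
}
new_fact = ("Marie Curie", "born_in", "Poland")
candidates = find_candidates(new_fact, store, toy_embed)
apply_resolution(store, "f0", "modify", new_fact)
```

`f0` is picked up through the shared entity and `f2` through embedding similarity alone, which shows why the two lookups complement each other; in a real run an LLM would classify each candidate pair before any strategy is applied.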

Installation

pip install spacy httpx openai filelock numpy pandas

# Download spaCy model
python -m spacy download en_core_web_trf

Usage

Step 1: Entity Extraction

from entity_type_extract import EntityExtractor

extractor = EntityExtractor(
    model_name="en_core_web_trf",
    chunk_size=512
)

extractor.process_file("input.txt", "output_entities.json")

Step 2: Relation Extraction

from schema_fact_extract import RelationExtractor
from llm_client import LLMConfig

llm_config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=0.0
)

extractor = RelationExtractor(llm_config=llm_config)
extractor.process_file("output_entities.json", "output_relations.json")

Step 3: Build Memory Structure

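from memory import ThreeLayerMemory, load_openie_results

# filtered_results.json is presumably the output of the ontology filtering step
data = load_openie_results("filtered_results.json")

# Build the schema, fact, and passage layers from the loaded results
memory = ThreeLayerMemory()
memory.build_from_openie_results(data)

# Save memory
memory.save("memory.json")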

Step 4: Conflict Detection & Resolution

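from resolve_conflict import detect_triple_conflicts, load_all_triples_with_ids

# Load every extracted triple together with its ID
triple_list, triple_ids = load_all_triples_with_ids("openie_results.json")

# llm_model, embedding_model, and fact_id_to_fact are not defined in this
# snippet; create them before calling detect_triple_conflicts.
result = detect_triple_conflicts(
    triple_list=triple_list,
    triple_ids=triple_ids,
    llm_model=llm_model,
    embedding_model=embedding_model,
    fact_id_to_fact=fact_id_to_fact
)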

Key Features

- Hierarchical Organization: Three-layer structure from abstract schemas to concrete passages
- Bidirectional Indexing: Navigate between layers via index relationships
- Semantic Search: Vector-based similarity search across facts
- Conflict Resolution: Automated detection and resolution of contradictory facts
- Caching: SQLite-based caching for LLM responses
- Parallel Processing: Efficient batch processing for large datasets

Dependencies

- Python 3.8+
- spaCy (with transformer model)
- OpenAI SDK
- NumPy
- Pandas
- httpx
- filelock
