What is GraphRAG?
TL;DR
A RAG variant that combines knowledge graphs with vector search. Proposed by Microsoft Research in 2024, it is reported to outperform vector-only RAG on complex relational queries and on global queries that span a whole document collection.
GraphRAG: Definition & Explanation
GraphRAG is an extension of RAG (Retrieval-Augmented Generation) proposed by Microsoft Research in 2024. It addresses two major weaknesses of vector-only RAG: it cannot perform complex relational reasoning, and it handles global or holistic queries about large document collections poorly.

The pipeline:
(1) At ingest time, an LLM extracts entities (people, organizations, concepts) and relationships (associations, citations, causes) from the documents.
(2) A knowledge graph is built from the extractions (nodes = entities, edges = relationships).
(3) Clustering / community detection (e.g., the Leiden algorithm) groups closely related entities.
(4) An LLM generates hierarchical summaries for each cluster.
(5) At query time, the system combines vector search, graph traversal, and the hierarchical summaries.

Benefits: (a) a claimed 2-3x accuracy improvement on global queries compared with traditional RAG; (b) support for multi-hop reasoning; (c) preserved entity relationships, so questions like "how are A and B connected?" can be answered; (d) explainability, since the answer can be traced to the edges and nodes that were referenced.

Implementations: Microsoft GraphRAG (open-source Python library), Neo4j + LangChain (industrial), LlamaIndex Knowledge Graph, Weaviate + Property Graph.

Cost is roughly 2-3x higher than traditional RAG because graph construction requires additional LLM calls at ingest time, but the accuracy gains often justify it.

Use cases: enterprise knowledge base search; complex medical, legal, and financial documents; research paper citation analysis; compliance and audit log analysis.

Microsoft, AWS, Google, and Anthropic are reportedly moving to embed GraphRAG into mainstream LLM APIs in 2025-2026, which would position it as a standard RAG architecture for 2026.
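
The pipeline steps described above can be illustrated with a small, standard-library-only Python sketch. It is not the Microsoft GraphRAG implementation, and it makes several simplifying assumptions: the triples are hardcoded instead of LLM-extracted, connected components stand in for Leiden community detection, the "summary" is a plain join of facts instead of an LLM call, and the vector search part of step (5) is omitted. All names (TRIPLES, build_graph, detect_communities, summarize, find_path) are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples. In a real pipeline an LLM
# would extract these from documents at ingest time (step 1).
TRIPLES = [
    ("Alice", "works_at", "Acme"),
    ("Bob", "works_at", "Acme"),
    ("Acme", "acquired", "Widgets Inc"),
    ("Carol", "authored", "Paper X"),
    ("Paper X", "cites", "Paper Y"),
]


def build_graph(triples):
    """Step 2: nodes are entities, edges are labeled relationships."""
    graph = defaultdict(dict)
    for subj, rel, obj in triples:
        graph[subj][obj] = rel
        graph[obj][subj] = rel  # stored in both directions for traversal
    return graph


def detect_communities(graph):
    """Step 3 stand-in: connected components, NOT the Leiden algorithm."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        queue, members = deque([start]), set()
        seen.add(start)
        while queue:
            node = queue.popleft()
            members.add(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        communities.append(members)
    return communities


def summarize(community, triples):
    """Step 4 stand-in: join the community's facts instead of calling an LLM."""
    facts = [f"{s} {r} {o}" for s, r, o in triples if s in community]
    return "; ".join(facts)


def find_path(graph, source, target):
    """Step 5 (graph part only): BFS for the shortest chain between entities."""
    if source not in graph or target not in graph:
        return None
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nbr in graph[node]:
            if nbr not in parents:
                parents[nbr] = node
                queue.append(nbr)
    return None  # the two entities are not connected


graph = build_graph(TRIPLES)
communities = detect_communities(graph)
summaries = [summarize(c, TRIPLES) for c in communities]
```

Calling find_path(graph, "Alice", "Widgets Inc") walks two hops through "Acme", which is the kind of "how are A and B connected?" question that pure vector similarity struggles with. The per-community summaries are what a global query would draw on; a production system would replace each stand-in with the real component.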