Benchmarked the GraphRAG SDK against eight other GraphRAG and RAG systems on the GraphRAG-Bench Novel dataset.

The evaluation covers 2,010 questions across four task types: Fact Retrieval, Complex Reasoning, Contextual Summarization, and Creative Generation.

All tests ran on a MacBook Air (Apple M3, 24 GB) using GPT-4o-mini via Azure OpenAI for both answer generation and scoring.

Query Set

The evaluation runs against 2,000 questions drawn from the dataset. Here are two representative examples:

"In the narrative of 'An Unsentimental Journey through Cornwall', which plant known scientifically as Erica vagans is also referred to by another common name, and what is that name?"
"Within the account of the royal visit to St. Michael's Mount in Cornwall, who is identified as the person who married Princess Frederica of Hanover?"

>Disclosure: affiliated with FalkorDB and sharing our open-source work to collect feedback. Drop a star if you found it useful, thank you

u/Lower_Associate_8798