u/ReplyFeisty4409

I keep running into the same problem with agentic RAG.

Retrieval works when the task is:

“Find this invoice.”

It breaks when the task is:

“Sum all unpaid invoices.”

That isn’t a retrieval problem anymore.

It’s aggregation over structured information.

And once every document looks similar, chunk search starts feeling like the wrong abstraction.

Curious how others solve this.

u/ReplyFeisty4409 — 17 days ago

Been struggling with a specific RAG failure mode: collections of similar documents (invoices, contracts, receipts) where every document looks alike and the questions are aggregations, not searches.

"Total unpaid invoices from last quarter": a vector search returns chunks from random documents, not an answer. The more homogeneous the collection, the worse RAG performs.

The approach that worked for me: treat the LLM as a parser, not as the retrieval layer. Define the fields you want, extract them once per document into typed records, store in a database, query with real filters and aggregations. No embeddings, no similarity search.
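A minimal sketch of that pattern, using SQLite for the typed records. The schema, field names, and the `extract_fields` stub are all illustrative assumptions, not the tool's actual API; in practice the stub would be an LLM call that returns JSON matching the schema.

```python
import sqlite3

def extract_fields(document_text: str) -> dict:
    # Stand-in for the LLM-as-parser step: prompt a model to return
    # {"vendor", "amount", "due_date", "paid"} as validated JSON.
    # Stubbed here; the fields below are hypothetical.
    raise NotImplementedError

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE invoices (
        doc_id   TEXT PRIMARY KEY,
        vendor   TEXT,
        amount   REAL,
        due_date TEXT,    -- ISO 8601 date
        paid     INTEGER  -- 0 = unpaid, 1 = paid
    )
""")

# Pretend these records came out of extract_fields(), one per document.
rows = [
    ("inv-001", "Acme",   1200.0, "2024-04-10", 0),
    ("inv-002", "Globex",  450.5, "2024-05-02", 1),
    ("inv-003", "Acme",    300.0, "2024-06-20", 0),
]
conn.executemany("INSERT INTO invoices VALUES (?, ?, ?, ?, ?)", rows)

# "Total unpaid invoices from last quarter" is now a plain SQL
# aggregation instead of a similarity search over chunks.
total, count = conn.execute("""
    SELECT COALESCE(SUM(amount), 0), COUNT(*)
    FROM invoices
    WHERE paid = 0 AND due_date BETWEEN '2024-04-01' AND '2024-06-30'
""").fetchone()
print(total, count)  # → 1500.0 2
```

The point is that extraction happens once per document at ingest time, so every later question is a cheap, exact query rather than a retrieval call.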

Curious if others have hit this specific failure mode and how you handled it. Did you work around it within RAG (reranking, metadata filtering, hybrid search), or did you move to a different approach entirely?

(I built an OSS tool around this pattern: https://github.com/sifter-ai/sifter, there's also a paid cloud version. Disclosure: I'm the author.)
