u/snurss

How would you build a local PubMed/PMC-style search + QA system over a private local corpus?

I have a large local PMC/PubMed corpus on SSD and want to build a fully local system on my workstation that behaves somewhat like PubMed search, but can also answer questions over the local corpus with grounded references.

Hardware: RTX 5090, Ryzen 9 9950X3D, 96 GB RAM.

I already have the corpus parsed locally and partially indexed.

If you were building this today, what exact local setup would you use for:

  • retriever
  • reranker
  • local LLM
  • FAISS or something else
  • framework vs fully custom pipeline

I’m especially interested in responses from people who have actually built a local biomedical literature search / RAG system.

Thank you

reddit.com
u/snurss — 15 hours ago