
I built a local memory server for AI that’s just a single binary
Been working on this for a while and finally open sourced it. Every time I start a new chat my AI has amnesia. Cloud memory services charge insane prices for something that should just run on your machine.
modus-memory is a Go binary (~6MB) that gives any MCP-compatible client (Claude Desktop, Cursor, Cline, whatever) persistent memory. Everything stored as plain markdown files you can grep, edit in VS Code, or back up with git. No SQLite, no Postgres, no Docker, no Python.
What’s under the hood:
∙ BM25 search with field boosting and query caching (cold searches in <5ms, cached in microseconds)
∙ FSRS memory decay — same algorithm Anki uses. Stuff you never look at fades. Stuff you keep referencing gets stronger. Keeps the vault clean over time instead of becoming a junk drawer
∙ Cross-referencing — search for “authentication” and it also surfaces related facts, entities, and notes that share subjects/tags even if they don’t contain the keyword
∙ If you run llama-server or any OpenAI-compatible endpoint locally on port 8090 it’ll use your model for query expansion. Completely optional
There’s a free tier (1K docs, full search) and a $10/mo tier that unlocks the decay, cross-refs, and unlimited docs. Honestly still figuring out the right split so I’m open to opinions on that.
Also built a Khoj importer for anyone affected by their cloud shutting down on the 15th. One command converts your export into searchable markdown.
Happy to answer questions about the implementation. The BM25 and FSRS stuff was the most interesting part to build if anyone wants to nerd out about that