Built an agent memory upgrade — went from flat context to a 5-tier hybrid arch. Thoughts?
I've been running an AI coding agent (Hermes Agent, open source) and hit the usual wall: the built-in persistent memory is capped at ~2,200 chars. Fine for quick facts, useless for anything that scales.
So I layered it up into a 5-tier hybrid memory architecture:
Tier 1 — RAM (Built-in, ~2.2K chars)
Short-term working memory. Pipe-syntax compression, TTL flags per entry. Only the hottest ~15 facts live here.
Tier 2a — Vector DB (Chroma, 7,600+ embeddings)
Semantic search over everything the agent has ever stored. Good for fuzzy recall ("that thing about the dental case").
Tier 2b — Knowledge Graph (triple store, 18 relations, 25 entities)
This is what I just added. Structured relations: user → lives_in → city, project → port → 8223, bot → developed_for → person. KG queries are instant and precise — no embedding drift.
Tier 3 — Full-text session DB (SQLite FTS5)
Every message ever sent, searchable with boolean queries. Auto-saved, no management needed.
Tier 4 — Obsidian vault (user-readable markdown)
Mirror of key facts in editable .md files. The human can read/edit without any tooling.
New hybrid search cascade:
KG query → FTS5 session search → Vector search → Obsidian read
Plus a session diary — after every complex session, the agent writes an AAAK-compressed log entry to the vector DB for cross-session context.
---
What's your take on this? Overkill? Missing something obvious? Anybody else running a multi-tier memory setup for their agents? Would love to hear what's working (or not) for others.