u/sokomania

I've been running an AI coding agent (Hermes Agent, open source) and hit the usual wall: the built-in persistent memory is capped at ~2,200 chars. Fine for quick facts, useless for anything that scales.

So I layered it up into a 5-tier hybrid memory architecture:

Tier 1 — RAM (Built-in, ~2.2K chars)

Short-term working memory. Pipe-syntax compression, TTL flags per entry. Only the hottest ~15 facts live here.

Tier 2a — Vector DB (Chroma, 7,600+ embeddings)

Semantic search over everything the agent has ever stored. Good for fuzzy recall ("that thing about the dental case").

Tier 2b — Knowledge Graph (triple store, 18 relations, 25 entities)

This is what I just added. Structured relations: user → lives_in → city, project → port → 8223, bot → developed_for → person. KG queries are instant and precise — no embedding drift.

Tier 3 — Full-text session DB (SQLite FTS5)

Every message ever sent, searchable with boolean queries. Auto-saved, no management needed.

Tier 4 — Obsidian vault (user-readable markdown)

Mirror of key facts in editable .md files. The human can read/edit without any tooling.

New hybrid search cascade:

KG query → FTS5 session search → Vector search → Obsidian read

Plus a session diary — after every complex session, the agent writes an AAAK-compressed log entry to the vector DB for cross-session context.

---

What's your take on this? Overkill? Missing something obvious? Anybody else running a multi-tier memory setup for their agents? Would love to hear what's working (or not) for others.

Built an agent memory upgrade — went from flat context to a 5-tier hybrid arch. Thoughts?