
I've been working on securing the memory layer of LLM agents and wanted to share what I've found and built.
The problem: When agents persist memory across sessions (RAG indexes, vector stores, conversation logs), that memory becomes an attack surface. An attacker — or even a malicious document the agent ingests — can plant text that overrides system instructions, exfiltrates data via tool calls, or hijacks agent behavior persistently. This is different from prompt injection because the payload lives in stored memory, not the current input. It survives session restarts.
I built a middleware library to address this — it sits between the agent and its memory store and screens every read/write. It uses SHA-256 integrity baselines, pattern-based threat detection, and YAML-defined policies. No external API calls, runs locally, sub-100μs latency.
It was recently adopted by OWASP as the reference implementation for ASI06 (Memory Poisoning) in the OWASP Top 10 for Agentic Applications.
GitHub: https://github.com/OWASP/www-project-agent-memory-guard
Curious how others are thinking about this problem. Are you doing any validation on what goes into your agent's long-term memory?