u/Electronic-Salad9608

▲ 3 r/PromptEngineering+1 crossposts

I've been working on securing the memory layer of LLM agents and wanted to share what I've found and built.

The problem: When agents persist memory across sessions (RAG indexes, vector stores, conversation logs), that memory becomes an attack surface. An attacker — or even a malicious document the agent ingests — can plant text that overrides system instructions, exfiltrates data via tool calls, or hijacks agent behavior persistently. This is different from prompt injection because the payload lives in stored memory, not the current input. It survives session restarts.

I built a middleware library to address this — it sits between the agent and its memory store and screens every read/write. It uses SHA-256 integrity baselines, pattern-based threat detection, and YAML-defined policies. No external API calls, runs locally, sub-100μs latency.

It was recently adopted by OWASP as the reference implementation for ASI06 (Memory Poisoning) in the OWASP Top 10 for Agentic Applications.

GitHub: https://github.com/OWASP/www-project-agent-memory-guard

Curious how others are thinking about this problem. Are you doing any validation on what goes into your agent's long-term memory?
u/Electronic-Salad9608 — 8 days ago

As AI agents become more autonomous and persist memory across sessions (RAG indexes, conversation history, vector stores), there's a growing attack surface that most people aren't thinking about: memory poisoning.An attacker can plant malicious text into an agent's memory that overrides instructions, exfiltrates data, or hijacks tool calls — and the attack persists because the memory does. It's not a one-shot prompt injection; it's a persistent backdoor.I've been working on OWASP Agent Memory Guard — the official reference implementation for ASI06 (Memory Poisoning) from the OWASP Top 10 for Agentic Applications. It sits between the agent and its memory store, screening every read/write through:- SHA-256 integrity baselines- Built-in threat detectors (prompt injection, PII leakage, key tampering)- YAML-defined policy enforcement- Sub-100μs latency, zero external dependenciesIt hooks into before_model, after_model, and wrap_tool_call in the agent loop. Three violation modes: block, warn, strip.Currently has integrations for LangChain with more coming. Would love feedback from anyone building production agents — especially failure cases where memory got corrupted or manipulated.What approaches are you all using to protect agent memory today?

reddit.com
u/Electronic-Salad9608 — 8 days ago

I've been working on the OWASP reference implementation for memory poisoning defense (ASI06 in the new Top 10 for Agentic Applications) and the LangChain integration is now shipping.

```python
from langchain_agent_memory_guard import MemoryGuardMiddleware
from agent_memory_guard import Policy

middleware = MemoryGuardMiddleware(policy=Policy.strict())
```

Three hooks into the LangChain agent loop:

- `before_model`: scans messages in agent state (catches injection in memory/context)
- `after_model`: scans model response (catches secret leakage / propagation)
- `wrap_tool_call`: scans tool output (the primary attack vector — adversarial content from tools)

Three violation modes: block, warn, strip. YAML-defined policies, no API keys, no external dependencies.

Benchmark on 55 attack payloads: 92.5% detection, 100% precision, 0% FPs, 59µs median latency.

Docs PR is in flight at langchain-ai/docs#3846. Until that merges, source is at:
https://github.com/OWASP/www-project-agent-memory-guard/tree/main/integrations/langchain-agent-memory-guard

Apache-2.0. Feedback welcome — especially failure cases on real production agent setups.

reddit.com
u/Electronic-Salad9608 — 11 days ago