
Context Layer for AI Coding Agents
A codebase is mostly drift. The rest was decided in a Claude session that isn't captured. Only the engineer who built it knows which is which, and that map exists nowhere else.
Decisions, trade-offs, dead ends, and the calls Claude makes for you that creep into the code unannounced. None of it is captured beyond the session. Code is the final output of context, not the context itself. CLAUDE.md helps a little but it’s a static page. The actual reasoning lives in 50 jsonl files in your local Claude folder, and nobody reads them.
Here’s an honest attempt at building the context layer that holds it.
Components, concepts, flows.
Everything about a project fits in three categories. Components are code-anchored: a service, a module, a table. Concepts are cross-cutting ideas: auth model, event sourcing, multi-turn context. Flows are sequential processes within concepts: user message to response, OAuth handoff. These three are MECE (mutually exclusive, collectively exhaustive), and they're how I internally map a project.
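A minimal sketch of the taxonomy in Python. The names (`Kind`, `Entity`, the example entities) are mine, not a fixed schema:

```python
from dataclasses import dataclass
from enum import Enum

class Kind(Enum):
    COMPONENT = "component"  # code-anchored: a service, a module, a table
    CONCEPT = "concept"      # cross-cutting idea: auth model, event sourcing
    FLOW = "flow"            # sequential process within a concept

@dataclass(frozen=True)
class Entity:
    kind: Kind
    name: str

# A project maps onto the three categories like this:
entities = [
    Entity(Kind.COMPONENT, "payments-service"),
    Entity(Kind.CONCEPT, "auth model"),
    Entity(Kind.FLOW, "oauth-handoff"),
]
```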
The atom is a claim.
Below the three categories, everything is a claim. One atomic fact, immutable once written. When a claim becomes wrong, you don't edit it. You write a new claim with a forward-pointing edge. The old claim stays. History never gets rewritten. Claims can be linked to any of the components, concepts, or flows. Rename a database table from users -> accounts. Claims tagged with the old name still resolve when you query the new one, because the rename itself lands as a new claim with an edge linking the two names.
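The immutability rule and the rename example can be sketched like this. I'm modeling the forward edge as a `supersedes` pointer on the new claim; the real edge direction and field names are my assumptions:

```python
from dataclasses import dataclass
from typing import Optional
import itertools

_ids = itertools.count(1)
claims = []

@dataclass(frozen=True)
class Claim:
    id: int
    text: str
    tags: tuple                        # entities this claim is linked to
    supersedes: Optional[int] = None   # edge from the replacement to the old claim

def write_claim(text, tags, supersedes=None):
    c = Claim(next(_ids), text, tuple(tags), supersedes)
    claims.append(c)   # append-only: nothing is ever edited or deleted
    return c

old = write_claim("table `users` stores one row per signup", ["users"])
# Rename users -> accounts: don't touch `old`, write a new claim that links back.
new = write_claim("table `accounts` stores one row per signup",
                  ["accounts", "users"], supersedes=old.id)

def resolve(tag):
    """Live claims for a tag: skip anything a newer claim supersedes."""
    dead = {c.supersedes for c in claims if c.supersedes is not None}
    return [c for c in claims if tag in c.tags and c.id not in dead]
```

Queries by either name land on the current claim, and the old claim stays in the log as history.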
Observations → claims, eventually
Capture shouldn't write directly to the graph. Not everything noticed in a session is a claim. Every fact starts as an observation, with an inference type (saw it stated, inferred from code, inferred from what's missing). Similar observations across different sessions cluster together. One-off observations get ignored later. When multiple sessions reinforce the same idea, the observation is promoted to a claim. This is what stops every statement from becoming important.
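A toy version of the promotion gate. The threshold of three sessions is my guess, not a number from the post:

```python
from collections import defaultdict
from enum import Enum

class Inference(Enum):
    STATED = "saw it stated"
    FROM_CODE = "inferred from code"
    FROM_ABSENCE = "inferred from what's missing"

PROMOTE_AT = 3  # distinct sessions needed before promotion; tune per project

observations = defaultdict(list)  # fact -> [(session_id, inference_type), ...]

def observe(fact: str, session: str, kind: Inference):
    """Record an observation; promote to a claim only once enough
    distinct sessions have reinforced the same fact."""
    observations[fact].append((session, kind))
    sessions = {s for s, _ in observations[fact]}
    if len(sessions) >= PROMOTE_AT:
        return f"CLAIM: {fact}"
    return None  # still just an observation
```

One-off notices never cross the threshold, so they never pollute the graph.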
Drift detection by absence.
Some claims aren't just true. They're true AND nobody decided them. There's no retry on the payment endpoint, but nobody decided not to add one. Drift is its own flavor of claim. The agent flags it when it sees a rule with no decision-event nearby. This is a real category in codebases.
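The flagging rule is simple enough to sketch. Representing decision events as optional IDs on a rule is my assumption:

```python
# Rules extracted from the codebase, each with an optional linked decision
# event (an ADR, a captured session, a PR discussion). IDs are invented.
rules = [
    {"rule": "no retry on payment endpoint", "decision": None},
    {"rule": "webhooks retry 3x with backoff", "decision": "ADR-014"},
]

def flag_drift(rules):
    """A rule with no decision-event nearby is drift: true, but undecided."""
    return [r["rule"] for r in rules if r["decision"] is None]
```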
Querying the graph
Agents get a snapshot, a regeneratable folder of markdown organized hierarchically by tag, that they can grep like any other folder. Claims are scoped, because the same project behaves differently in different contexts. A claim true on main may not be true on a feature branch. A flow may be live for enterprise customers and stubbed for everyone else. Queries take a context (branch, region, flag state, customer tier, etc.) and return the snapshot that holds under it. Cross-scope queries work too: what's different between main and this branch? What's true for enterprise that isn't true for free tier? Senior engineers carry this kind of mental map already. The layer makes it queryable.
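Scoped claims and cross-scope diffs can be sketched as dictionary matching. The claim texts and scope keys here are made up for illustration:

```python
scoped_claims = [
    {"text": "rate limit is 100 rps", "scope": {"branch": "main"}},
    {"text": "rate limit is 10 rps",  "scope": {"branch": "feat/throttle"}},
    {"text": "SSO flow is live",      "scope": {"tier": "enterprise"}},
]

def snapshot(context: dict):
    """Claims that hold under a context (branch, region, flag state, tier...).
    A claim applies when every key in its scope matches the context."""
    return {c["text"] for c in scoped_claims
            if all(context.get(k) == v for k, v in c["scope"].items())}

def diff(ctx_a: dict, ctx_b: dict):
    """Cross-scope query: what's true in one context but not the other?"""
    a, b = snapshot(ctx_a), snapshot(ctx_b)
    return {"only_a": a - b, "only_b": b - a}
```

The same question a senior engineer answers from memory ("what's different on this branch?") becomes `diff({"branch": "main"}, {"branch": "feat/throttle"})`.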
Confidence decays.
A claim that was true once may not be true now. Confidence is not a label you set; it's a function of how many direct observations support a claim, where they came from, and how recently. Rules might expire in 90 days, constraints in 30, since it is wrong to assume that ALL the context about a system is feeding into this layer. When the time window passes without a fresh observation, the claim surfaces for re-verification. The truth is in the code; the memory shouldn't become stale. Rotten claims will make the layer unusable.
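One way to make that concrete: weight each observation by source and halve its contribution on a fixed half-life. The source weights and half-life are my assumptions; only the 90/30-day expiry windows come from the text:

```python
DAY = 86400  # seconds
EXPIRY = {"rule": 90 * DAY, "constraint": 30 * DAY}  # windows from the post

# How much an observation counts by origin (invented weights).
SOURCE_WEIGHT = {"stated": 1.0, "from_code": 0.8, "from_absence": 0.5}

def confidence(observations, half_life=30 * DAY):
    """observations: list of (source, age_in_seconds).
    Each observation's weight halves every `half_life` seconds, so
    confidence reflects count, origin, and recency at once."""
    return sum(SOURCE_WEIGHT[src] * 0.5 ** (age / half_life)
               for src, age in observations)

def needs_reverify(kind: str, newest_observation_age: float) -> bool:
    """Surface the claim when its window passes with no fresh observation."""
    return newest_observation_age > EXPIRY[kind]
```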
TLDR:
Capture the decisions, the trade-offs, and the drift that happens during your Claude Code sessions. Structure it. Scope it. Make it queryable. Give your AI agent the same mental map of the project that the engineer who built it carries in their head.