
Context Is Not Control: Source-Boundary Failures in Controlled Text-Mediated Evidence Use.
Ok. The raw dawg researcher is back!
This time I’ve released a working paper + replication artifacts on source-boundary failures in LLM evidence use.
The claim is basically that language models can treat text that's merely present in the context window as answer-bearing evidence, even when that text is not admissible to the task.
This paper's benchmark is specifically about whether models preserve the distinction between
* context
* admissible source
* injected/contaminating text
* instruction
* answer-shaped but unsupported content
The release includes working manuscript, open-weight replication package, frontier/API replication package, GitHub repo, Zenodo, DOl archive.
The strongest result, in plain English, is that giving models an "INSUFFICIENT" output option was not enough. Recovery appeared when the task frame explicitly represented source admissibility / source boundaries.
I'd be especially interested in critique around experimental design, my scoring choices, what the strongest confound or missing ablation might be. I appreciate any feedback.
[Repo](https://github.com/rjsabouhi/context-is-
not-control)
[Paper + Reproduction](https://zenodo.org/records/
20126173)