u/Glittering-Pie6039

r/nextjs

Intermittent 404s on dynamically generated images in Next.js on Vercel, turns out /tmp is per-container

I'm generating PNG images server-side (using a headless rendering library) and serving them through a separate API route. The render route writes the PNGs to os.tmpdir() under a batch ID, and the image route reads them back from the same path. Worked perfectly in local dev. Deployed to Vercel, started getting 404s on maybe 30-40% of image loads.

The stack trace looked alarming. The browser console showed what seemed like an infinite re-render loop, hundreds of repeated React reconciliation calls. That sent me down the wrong path for a while, thinking there was a client-side state issue causing a retry spiral. Turned out it was just the browser console dumping the full React component tree for a failed <img> load. Not an actual loop.

The real cause is straightforward once you know it. Vercel serverless functions run on ephemeral containers. The render route writes files to /tmp/carousel/{batchId}/. The image route tries to read from the same path. But these are separate function invocations that can hit different containers. Container A writes the files. Container B gets the image request. Container B's /tmp is empty. 404.

It works sometimes because warm containers can serve both requests. It fails when the image request lands on a different container than the one that rendered. That's why it's intermittent, which is also why it took me longer than it should have to diagnose. Intermittent bugs with no error logs on the server side (just a clean 404) don't give you much to work with.
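The broken pattern looks roughly like this. The route names, handler shapes, and bodies here are my reconstruction, not the actual code (in a real Next.js App Router these would be exported GET/POST handlers); the /tmp/carousel/{batchId} layout is from the post:

```typescript
// Reconstruction of the broken pattern: two route handlers that
// implicitly assume they share a filesystem. They don't on Vercel.
import { mkdir, writeFile, readFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Render route: runs in container A, writes PNGs into *its own* /tmp
export async function renderHandler(batchId: string, pngs: Buffer[]): Promise<void> {
  const dir = join(tmpdir(), "carousel", batchId);
  await mkdir(dir, { recursive: true });
  await Promise.all(pngs.map((png, i) => writeFile(join(dir, `${i}.png`), png)));
}

// Image route: may run in container B, whose /tmp never saw the write
export async function imageHandler(batchId: string, index: number): Promise<Buffer | null> {
  const path = join(tmpdir(), "carousel", batchId, `${index}.png`);
  try {
    return await readFile(path);
  } catch {
    return null; // surfaces as a 404 to the browser
  }
}
```

Run both handlers in one process (local dev, or a warm container serving both requests) and the read succeeds every time, which is exactly why the bug hides until deploy.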

I also had a zip download route that read from the same /tmp path to bundle all images. Same bug, I just hadn't noticed it yet because I'd been testing downloads immediately after rendering (same warm container).

The fix was to stop using the filesystem entirely. The render route now returns base64 data URLs inline in the response instead of writing to disk. The client gets the image data directly. <img src> handles data URLs the same as regular URLs, so the component barely changed. For the zip download, I moved to client-side generation with JSZip. The browser has the base64 data already, so it just packs the zip locally instead of making another server round-trip.
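A minimal sketch of the inline approach. The helper names here are mine, not from the post; the JSZip usage in the comment uses that library's real API but is illustrative only:

```typescript
// Render route now base64-encodes each PNG and returns data URLs
// in the response body. Nothing touches the filesystem.
export function toDataUrl(png: Buffer): string {
  return `data:image/png;base64,${png.toString("base64")}`;
}

// Response shape: the client sets <img src={url}> on each entry directly,
// since data URLs work anywhere a regular URL does.
export function renderResponse(pngs: Buffer[]): { images: string[] } {
  return { images: pngs.map(toDataUrl) };
}

// Client side, the zip is built locally with JSZip from the same base64
// payload, so no second server round-trip:
//
//   const zip = new JSZip();
//   images.forEach((url, i) =>
//     zip.file(`${i}.png`, url.split(",")[1], { base64: true }));
//   const blob = await zip.generateAsync({ type: "blob" });
```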

This eliminated the image route and the download route completely. Both were dead code after the change. Fewer API routes, no filesystem dependency, and it's actually faster because there's no second HTTP request per image.

If you're generating files server-side on Vercel and serving them through a separate route, you'll hit this eventually. The /tmp directory is not shared across invocations. In-memory or inline responses are the way to go for anything that needs to survive between the write and the read.

TL;DR: Vercel serverless containers don't share /tmp. Writing files in one API route and reading them in another will 404 intermittently when requests hit different containers. Fix: return base64 data URLs inline instead of writing to disk. Eliminated two API routes and the filesystem dependency entirely.

reddit.com
u/Glittering-Pie6039 — 21 hours ago

LLM validation passes leak reasoning into structured output even when explicitly told not to. Here is the two-layer fix.

I'm building a tool that runs two LLM passes in series. The first generates structured content. The second validates it against a constraint set and rewrites violations. The validation prompt explicitly says: return ONLY the corrected text, no commentary, no reasoning.

The model complies about 95% of the time. The other 5%, it outputs things like "Let me check this text for violations..." or "I need to verify the constraints..." before the corrected content. That reasoning gets passed straight through to the parser, which chokes because it's expecting the first line to be a content marker, not a sentence about checking constraints.

The fix is two layers.

Layer 1: Prompt tightening. The validation prompt now explicitly forbids reasoning, preamble, and violation lists. It says the output must start with the first content marker. This reduced the frequency from ~5% to ~1%, but did not eliminate it.

Layer 2: Defensive strip before parsing. A stripValidationPreamble() function runs on every validation output before any parser touches it. For structured formats it anchors to the first recognised marker and throws away everything before it. For plain-text formats it strips lines matching known validator commentary patterns (things like "Let me check this text" or "This violates the constraint").

The strip-before-parse ordering is the key decision. Every downstream parser operates on already-sanitised output. You don't end up maintaining per-field stripping logic or playing whack-a-mole with new reasoning formats.

One thing I had to be careful with: the plain-text strip patterns. A regex that catches "This is a violation" will also catch "This is a common mistake" in legitimate content. I tightened the patterns to only match validator-specific language, things like "This violates the/a rule/constraint" rather than broad matches on "This is" or "This uses." Each pattern needs auditing against real content before you ship it.
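A sketch of what that function could look like. The name stripValidationPreamble is from the post, but the content marker (I'm assuming a "## " heading) and the exact regexes are my guesses at the described behavior:

```typescript
// Hypothetical content marker for the structured format; swap in whatever
// your parser actually anchors on.
const CONTENT_MARKER = /^##\s/m;

// Validator-specific commentary only. Deliberately narrow, so legitimate
// content like "This is a common mistake" is never stripped.
const PREAMBLE_PATTERNS: RegExp[] = [
  /^Let me (check|verify|review)\b/,
  /^I need to (check|verify)\b/,
  /^This violates (the|a) (rule|constraint)\b/,
];

export function stripValidationPreamble(output: string): string {
  // Structured case: anchor to the first recognised marker and discard
  // everything before it, whatever the model said.
  const m = output.match(CONTENT_MARKER);
  if (m && m.index !== undefined) return output.slice(m.index);

  // Plain-text case: drop only *leading* lines that match known
  // validator commentary, then stop at the first real content line.
  const lines = output.split("\n");
  let start = 0;
  while (
    start < lines.length &&
    (lines[start].trim() === "" ||
      PREAMBLE_PATTERNS.some((p) => p.test(lines[start].trim())))
  ) {
    start++;
  }
  return lines.slice(start).join("\n");
}
```

Because this runs before any parser, the downstream code never needs to know the preamble existed.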

If you're parsing structured output from an LLM, I'd treat prompt instructions as a best-effort first pass and always have a code-level defense before the parser. The model will comply 95% of the time. The 5% where it doesn't will break your downstream logic in ways that are hard to reproduce because they're intermittent.

TL;DR: LLM validation passes leak reasoning into structured output despite explicit instructions not to. Prompt tightening reduces frequency but doesn't eliminate it. The fix is a strip function that runs before parsing, anchoring to the first valid content marker and throwing away everything before it. Treat prompt compliance as best-effort, not guaranteed.

reddit.com
u/Glittering-Pie6039 — 4 days ago