u/AlexZan

r/ClaudeAI

If you've been wondering why your Max plan exhausts faster than it should, you're not crazy and it's not your imagination.

I asked a Claude Opus 4.7 agent to investigate its own token usage. After 8 turns it had been billed for 127K tokens against ~25K of unique content, roughly 5× what the content itself accounts for. It noticed the discrepancy and started reading
its own session logs. It surfaced GitHub issues going back to mid-December 2025, two reverse-engineered bugs in the Claude Code binary, and a community-written patch that Anthropic hasn't shipped.

The tl;dr:

  • Bug A: a billing-word substitution in the binary trips on common terminology and forces a full uncached rebuild every turn (10-20× cost impact)
  • Bug B: claude --resume and --continue invalidate the cache the moment you resume, so you pay full freight on the first turn
  • Telemetry coupling: disabling telemetry silently disables the 1-hour cache TTL (privacy users get penalized)
  • Peak-hour throttle: Anthropic confirmed it only after press contact and never published the magnitude
  • None of the cache bugs are acknowledged in any Anthropic release note despite six weeks of acute reports

The data needed to detect this is already on your machine — Anthropic just doesn't surface it in the UI. I built a 50-line statusline tool that reads the same JSONL Claude Code already writes
locally and shows your per-turn cache hit rate in real time. My book-writing chat had 128 cache flush events when I deployed it.
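
For anyone who wants to see roughly what a per-turn cache hit rate calculation looks like, here is a minimal Python sketch. It is not the code from cc-cache-monitor. It assumes each JSONL record for an assistant turn carries a `message.usage` object with the Anthropic API usage fields `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens`. The helper names and the 0.5 "flush" threshold are hypothetical choices of mine for illustration, not definitions taken from the tool.

```python
import json


def cache_hit_rate(usage):
    """Fraction of input-side tokens that were served from cache for one turn."""
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    fresh = usage.get("input_tokens") or 0
    total = read + created + fresh
    return read / total if total else 0.0


def per_turn_rates(jsonl_lines):
    """Return one hit rate per record that has a message.usage object.

    Assumption: usage lives at record["message"]["usage"]. Lines that are
    blank, not valid JSON, or lack usage data are skipped.
    """
    rates = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if isinstance(usage, dict):
            rates.append(cache_hit_rate(usage))
    return rates


def count_flushes(rates, threshold=0.5):
    """Count turns after the first whose hit rate falls below the threshold.

    The first turn is excluded because a cold cache is expected there.
    The threshold is an illustrative assumption.
    """
    return sum(1 for rate in rates[1:] if rate < threshold)
```

To try it, open a session log and pass the file object straight in, e.g. `per_turn_rates(open(path))`, where `path` points at one of the JSONL files Claude Code writes locally. A healthy long session should show rates close to 1.0 after the first turn, and a sudden drop toward 0.0 mid-session is the pattern worth looking at.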

Tool: https://github.com/AlexZan/cc-cache-monitor

Full writeup with timeline + sources: https://medium.com/@alexzanfir/claude-diagnosed-its-own-cache-bug-a-six-month-timeline-332f577e1fe9

Mitigations until Anthropic ships a fix:

  • Avoid the GMT peak window (1pm-7pm GMT / 5am-11am PT weekdays)
  • Don't use --resume or --continue
  • One Claude Code session at a time during dense work
  • Don't disable telemetry (counterintuitive but real)
  • Run cc-cache-monitor in your statusline so you see the bug fire in real time
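
If you want to script the first mitigation, a small check like the one below can tell you whether a given moment falls inside the peak window. This is only a sketch: it treats GMT as UTC, takes the 1pm-7pm weekday window from the list above at face value, and uses names (`in_peak_window`, the two hour constants) that I made up for illustration.

```python
from datetime import datetime, timezone

# Peak window from the mitigation list: 1pm-7pm GMT on weekdays.
PEAK_START_HOUR_UTC = 13
PEAK_END_HOUR_UTC = 19  # exclusive


def in_peak_window(now):
    """Return True if a timezone-aware datetime falls in the weekday peak window.

    Pass an aware datetime. A naive one would be interpreted as local time
    by astimezone(), which may not be what you intend.
    """
    now_utc = now.astimezone(timezone.utc)
    is_weekday = now_utc.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and PEAK_START_HOUR_UTC <= now_utc.hour < PEAK_END_HOUR_UTC


if __name__ == "__main__":
    if in_peak_window(datetime.now(timezone.utc)):
        print("Inside the peak window: consider deferring dense work.")
    else:
        print("Outside the peak window.")
```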

I'm explicitly not recommending "switch to Sonnet" — if you paid for Opus, you paid for Opus. "Use a worse model" subsidizes the broken state. The article goes deeper into why.

u/AlexZan — 9 days ago