There are signs Anthropic may be losing momentum, and I don’t think this is just a normal “cycle” issue. It looks more like they’re hitting limits before OpenAI.
Both companies are likely subsidizing usage (at least that’s what they claim), but OpenAI has more capital and compute, so it can absorb pressure better. It also seems to have fewer extremely heavy users compared to Anthropic’s ecosystem (e.g., Claude Code).
The real question is how each company responds to that pressure.
Anthropic’s approach appears to be silent degradation: optimizing models in ways that reduce cost while also restricting usage (for example, limiting third-party harnesses). Opus 4.5 felt like a peak. Then 4.6 became more capable but also more constrained by these optimizations. The end result was arguably still better, especially with the 1M context, but the trade-offs were already visible.
With 4.7, the intelligence improved again, but the optimization push seems too aggressive. The model feels overly steered. I’m not an expert, but it likely relates to post-training choices. Combine that with Claude Code being increasingly tuned to constrain and optimize usage, and the overall UX starts to degrade.
On OpenAI’s side, 5.4 (xhigh) feels relatively unconstrained. But 5.5 shows signs of similar optimization pressure: it can be less thorough and tends to end tasks earlier. It’s not as pronounced, but it resembles the same “lazy” signature people associate with Claude.
Meanwhile, Chinese models are catching up fast. They don’t need to be frontier-level, just “good enough” at a much lower cost. I’ve been testing OpenCode Go with Kimi 2.5/2.6, DeepSeek v4 Max, and GLM 5.1 (though that one burns through usage quickly). They’re not on par with models like Codex 5.3–5.5, but the gap has narrowed a lot.
At this point, a hybrid workflow already works: use cheaper models for most tasks, and rely on Claude or Codex as reviewers or for harder logic. You get acceptable results at a significantly lower cost.
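The tiered workflow above can be sketched as a simple router: cheap models handle routine work, and a frontier model is pulled in only as a reviewer for harder tasks. Everything here is a hypothetical placeholder, the model names, the `route_task` function, and the complexity labels are mine, not any vendor’s API.

```python
def route_task(task: str, complexity: str) -> list[str]:
    """Return the (hypothetical) sequence of models to run for a task.

    Cheap model drafts everything; a frontier model reviews only
    when the task is marked hard. Names are illustrative placeholders.
    """
    if complexity == "hard":
        # Draft cheaply, then escalate to a frontier reviewer.
        return ["cheap-model-draft", "frontier-model-review"]
    # Routine tasks never touch the expensive model.
    return ["cheap-model-draft"]


print(route_task("refactor auth module", "hard"))
# → ['cheap-model-draft', 'frontier-model-review']
print(route_task("rename a variable", "easy"))
# → ['cheap-model-draft']
```

The design point is that the expensive model only sees a fraction of the traffic, which is where most of the cost savings come from in this kind of setup.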
Overall, these companies are walking a tightrope, trying to balance performance, cost, and user satisfaction. How they handle that trade-off will shape how this space evolves.
What’s your take: is Anthropic actually hitting limits here, or is this just a normal model iteration cycle?