I keep being shocked by how bad the reasoning of Opus 4.7 is. It still seems fine for programming tasks, but when I ask it to advise me about things, it often produces illogical, nonsensical, and flat-out wrong responses, and it shows that it didn't understand simple concepts we had just discussed in the same conversation.
It is so much worse than previous models that I'm wondering whether we might be starting to see signs of model collapse: the idea that as more and more content on the internet is AI-generated, new models end up being trained on the output of older models, and each such generation loses a bit of the diversity and accuracy of real human data.
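To make the mechanism concrete, here's a toy sketch of my own (not anything from the model-collapse literature specifically): a "model" that learns token frequencies from a corpus but forgets rare tokens, then generates the training corpus for the next generation. The tail of the distribution vanishes and never comes back. All names and thresholds here are made up for illustration.

```python
from collections import Counter

def train(corpus, min_count=2):
    """'Train' a toy model: learn token frequencies, but (like any
    finite-capacity learner) forget tokens seen fewer than min_count times."""
    counts = Counter(corpus)
    return {tok: c for tok, c in counts.items() if c >= min_count}

def generate(model, size):
    """'Generate' a synthetic corpus of roughly `size` tokens
    in proportion to the learned frequencies."""
    total = sum(model.values())
    corpus = []
    for tok, c in model.items():
        corpus.extend([tok] * max(1, round(size * c / total)))
    return corpus

# Generation 0: "human" data with a long tail of rare tokens.
corpus = ["the"] * 40 + ["cat"] * 12 + ["sat"] * 6 + [f"rare{i}" for i in range(12)]
vocab_sizes = [len(set(corpus))]
for _ in range(3):
    model = train(corpus)              # the next model only sees AI output
    corpus = generate(model, len(corpus))
    vocab_sizes.append(len(set(corpus)))

print(vocab_sizes)  # → [15, 3, 3, 3]
```

The tail is lost in the very first synthetic generation, and the remaining distribution then just reproduces itself. Real training pipelines are vastly more complicated, but this is the basic worry: once rare knowledge drops out of the training data, later models can't recover it from that data alone.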
And it's not easy to filter out AI content. We all know how unreliable AI detectors are, so the more AI content there is on the internet, the more our training data becomes "infected". Have we reached peak LLM performance, and will models only degrade from here?