The reason some AI assistants feel smart and others feel dumb has nothing to do with the model
There's a framing that dominates almost every AI evaluation I've seen: which model is powering it?
GPT-5? Claude? Gemini? The implicit assumption is that smarter model = better product.
I think this is mostly wrong, and it's leading teams to optimize the wrong thing.
The frontier models available today are, for most practical purposes, comparable. They're all extraordinarily capable. The variance in user experience between products isn't primarily driven by which model sits underneath.
What actually determines whether an AI assistant feels intelligent — whether it gets better over time, personalizes meaningfully, earns user trust — is whether it has memory.
Not in a vague sense. Concretely: does the agent retain structured context across sessions? Does it remember your preferences without being reminded every time? Can it reference what you discussed three weeks ago?
An agent with no memory treats every user as a stranger on every visit. The best model in the world, configured this way, will feel worse than a less capable model that actually knows who it's talking to.
Three things worth building memory around:
- Preferences and style — how the user likes to communicate, what format they want, what to avoid
- History and context — what they've worked on, what's been decided, what's been tried
- Goals and constraints — what they're actually trying to accomplish and what limits them
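The three categories above can be sketched as a simple per-user record that gets rendered into context at the start of each session. This is a minimal illustration, not any product's actual architecture; all names here (`UserMemory`, `to_context`) are hypothetical:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: one persistent record per user, split into the
# three memory categories discussed above.
@dataclass
class UserMemory:
    # Preferences and style: format, tone, things to avoid
    preferences: dict = field(default_factory=dict)
    # History and context: what's been worked on, decided, tried
    history: list = field(default_factory=list)
    # Goals and constraints: what the user is trying to accomplish
    goals: list = field(default_factory=list)

    def to_context(self) -> str:
        """Render the record as a preamble the assistant sees every session."""
        lines = []
        if self.preferences:
            lines.append("Preferences: " + "; ".join(
                f"{k}={v}" for k, v in self.preferences.items()))
        if self.history:
            # Only the most recent items, to keep the context window small
            lines.append("Recent history: " + "; ".join(self.history[-3:]))
        if self.goals:
            lines.append("Goals: " + "; ".join(self.goals))
        return "\n".join(lines)

mem = UserMemory()
mem.preferences["format"] = "bullet points"
mem.history.append("migrated billing service to Postgres")
mem.goals.append("ship v2 API by Q3")
print(mem.to_context())
```

The point of the sketch is the shape, not the code: the record persists across sessions, so on the user's next visit the assistant starts from this preamble instead of from zero.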
When all three are present, "which model are you using?" becomes a secondary question.
Curious whether others have noticed this in practice — has a tool's memory architecture affected your experience more than its underlying model?