u/Careful-Locksmith153

We're trying to measure where brands actually stand in LLM responses, and the more we dig into it the more I think most people approaching this problem are doing it wrong. Including us, for a while.

The core issue: an LLM is not a search engine. There's no index, no ranking, no defined mechanism by which a brand appears or doesn't. What the model says depends on who's asking, how they're asking, where they are in a decision, and what emotional state that implies. The same brand can be front and centre in one conversation and completely absent in the next without anything objectively changing except the phrasing of the question. And that's just one model. ChatGPT, Gemini, Claude and Perplexity share no common knowledge base, no common selection logic, no common understanding of sources. You're not looking into one black box. You're looking into several that don't know each other exist.

So when people talk about running thousands of prompts against a model and clustering the outputs — I get the instinct, but it's descriptive statistics without a reference frame. You're measuring something. You just don't know what.

The GEO discourse is pointing at the right problem. But most implementations I've seen skip the prior question: under what conditions does visibility arise at all? For whom, in what situation, with what intent.

Our answer was to build from the user side rather than the prompt side. Not average users — specific ones. Someone researching home energy retrofits because heating bills have become unmanageable asks differently than someone considering the same renovation for resale value. Someone evaluating an EV out of conviction behaves differently than someone running the numbers to see if it makes financial sense. Same topic, different motive, different tone, different tolerance for vague answers. The model responds with different recommendations, different sources, different framing. That variation is the data, not noise.
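To make "building from the user side" concrete, here's a minimal sketch of the idea. Everything in it is hypothetical (the persona fields, the prompt template, the example values); it just shows the mechanism: same topic, different motive, different question, so the model sees genuinely different inputs.

```python
from dataclasses import dataclass

@dataclass
class Persona:
    # All fields are illustrative, not a claim about which dimensions matter.
    situation: str            # what put them in the market
    motive: str               # why they are researching now
    tone: str                 # how they phrase the ask
    ambiguity_tolerance: str  # how much hedging they accept

def render_prompt(persona: Persona, topic: str) -> str:
    """Turn a persona plus a topic into the question that persona would ask."""
    return (
        f"I'm {persona.situation} and I'm looking into {topic} "
        f"because {persona.motive}. {persona.tone}"
    )

# Same topic, two motives: the prompts (and so the answers) diverge.
cost_driven = Persona(
    situation="a homeowner whose heating bills doubled last winter",
    motive="I need to get running costs under control",
    tone="Please be concrete about payback periods.",
    ambiguity_tolerance="low",
)
resale_driven = Persona(
    situation="a homeowner planning to sell in a few years",
    motive="I want to know what adds resale value",
    tone="A rough overview is fine.",
    ambiguity_tolerance="high",
)

for p in (cost_driven, resale_driven):
    print(render_prompt(p, "a home energy retrofit"))
```

The point of structuring it this way is that the variation across personas is deliberate and labeled, so when the model's recommendations shift between the two prompts you know which user dimension moved.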

The hard part — and I'll be upfront about this — is building those user models from actual data rather than intuition. The relevant data exists but it's fragmented. Census data tells you population structures. Tracking studies tell you how specific groups seek information. Segmentation models describe values and lifestyles. But no dataset connects household structure, digital literacy, political disposition and emotional state into a coherent profile. You have to model that connection yourself, which means it's an informed approximation. We're not pretending otherwise.
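One way to keep that "informed approximation" honest is to state the cross-source links as data rather than burying them in someone's head. A sketch of what that could look like; every field name, value and link below is made up for illustration:

```python
# Each dataset covers a different slice of the profile; none of them join
# on a shared key, so the bridge between them has to be modeled by hand.
census = {"household": "single parent, 2 kids", "income_band": "middle"}
tracking_study = {"info_seeking": "asks peers first, then searches"}
segmentation = {"values": "security-oriented", "digital_literacy": "medium"}

# Hand-authored bridge: an assumption, written down so it can be reviewed
# and revised, instead of living implicitly in the analyst's intuition.
assumed_links = {
    ("security-oriented", "middle"): {"emotional_state": "cost anxiety"},
}

def build_profile(census, tracking, segmentation, links):
    """Merge the fragments, then apply the explicit cross-source assumptions."""
    profile = {**census, **tracking, **segmentation}
    key = (segmentation["values"], census["income_band"])
    profile.update(links.get(key, {"emotional_state": "unknown"}))
    return profile

profile = build_profile(census, tracking_study, segmentation, assumed_links)
print(profile)
```

The merge itself is trivial; the value is that `assumed_links` makes the modeled connection a reviewable artifact, which is exactly the part no dataset gives you.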

What we haven't figured out is how to make those connections more methodologically robust. Whether latent class analysis, synthetic populations or something else entirely gets you closer to something defensible — genuinely not sure. If anyone has thought about this seriously I'd be interested in what direction you'd go.
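For what it's worth, the synthetic-population route can start very simply: sample from census-style marginals, then layer assumed conditionals on top, and check that the aggregates you care about come out plausible. A toy sketch, with entirely made-up distributions and an assumed tenure-to-interest conditional:

```python
import random

random.seed(0)

# Illustrative marginals, stand-ins for real census tables.
MARGINALS = {
    "age_band": [("25-39", 0.3), ("40-59", 0.4), ("60+", 0.3)],
    "tenure": [("owner", 0.55), ("renter", 0.45)],
}

# Assumed conditional: retrofit interest depends on tenure.
P_RETROFIT_INTEREST = {"owner": 0.4, "renter": 0.1}

def draw(marginal):
    """Sample one value from a list of (value, probability) pairs."""
    r, acc = random.random(), 0.0
    for value, p in marginal:
        acc += p
        if r < acc:
            return value
    return marginal[-1][0]

def sample_person():
    person = {k: draw(v) for k, v in MARGINALS.items()}
    person["retrofit_interest"] = (
        random.random() < P_RETROFIT_INTEREST[person["tenure"]]
    )
    return person

population = [sample_person() for _ in range(10_000)]
owners = [p for p in population if p["tenure"] == "owner"]
share = sum(p["retrofit_interest"] for p in owners) / len(owners)
print(f"retrofit interest among owners: {share:.2f}")
```

This obviously assumes independence between the marginals, which is exactly where it stops being defensible; latent class analysis or iterative proportional fitting are the standard ways to recover the joint structure the independence assumption throws away, which is why I'm curious what direction people here would take.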
