u/Sweaty-Stop6057

One thing that has consistently surprised me across different companies is how strong postcode features tend to be in models.

At first glance it seems odd that it's so predictive (it's "just geography"), but then it clicks: people tend to live near like-minded people, and the (visible) area-level behaviours often correlate well with the individual behaviours that we're interested in.

The features typically captured for each postcode are:

demographics
deprivation
housing characteristics
crime exposure
transport access
general behaviour patterns

These act as proxies for things that are hard to observe directly: renewal propensity, fraud, risk.

The catch is that postcode data is rarely "done properly". It's often:

built once and never updated
very incomplete
or treated as a static lookup rather than something that evolves over time
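
To make the "static lookup" point concrete, here is a minimal sketch of a point-in-time lookup: each postcode keeps dated snapshots, and a record is matched to the latest snapshot effective on or before its own date. Everything in it is an assumption for illustration: the `SNAPSHOTS` structure, the `features_as_of` helper, the `deprivation_decile` field, and the postcode and values are all made up, not taken from a real dataset.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical snapshots: postcode -> list of (effective_from, features),
# kept sorted by effective_from. All values are invented for illustration.
SNAPSHOTS = {
    "AB1 2CD": [
        (date(2021, 1, 1), {"deprivation_decile": 3}),
        (date(2023, 1, 1), {"deprivation_decile": 4}),
    ],
}


def features_as_of(postcode, as_of):
    """Return the latest snapshot effective on or before `as_of`, else None."""
    # Light normalisation so "ab1 2cd " and "AB1 2CD" hit the same key.
    history = SNAPSHOTS.get(postcode.strip().upper())
    if not history:
        return None  # unknown postcode: leave the gap visible
    dates = [effective_from for effective_from, _ in history]
    # Index of the first snapshot strictly after `as_of`.
    i = bisect_right(dates, as_of)
    return history[i - 1][1] if i > 0 else None


# A policy quoted in mid-2022 gets the 2021 snapshot, not the 2023 one.
print(features_as_of("AB1 2CD", date(2022, 6, 1)))
```

Matching on the record's own date rather than "whatever is in the table today" keeps training data from picking up area-level values that did not exist yet at the time, and returning `None` for unknown postcodes keeps the incompleteness visible instead of hiding it behind a default.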

Of course, there are important considerations around fairness and bias here, since geographic features can correlate with socio-economic factors. In practice, how these features are used depends heavily on the application and regulatory context.

Curious how others are handling this -- do you tend to use postcode features, or is it something that gets deprioritised?


Postcode is one of the most underrated features in modelling