Building a companion-like app that uses a vector DB to suggest possible, previously unnoticed mental-health issues based on patterns across conversations
I've been working on an AI-powered chatbot (personal project) that acts more like a long-term companion than a typical assistant. One of the goals is noticing patterns across conversations over time: recurring anxiety spirals, depressive language patterns, social-withdrawal signals, mood shifts, etc. Not diagnosing anyone, obviously, just surfacing behavioral patterns carefully and safely.
What I underestimated was how confusing the retrieval layer gets once conversations become emotional and deeply contextual.
A few problems I’ve run into:
semantic similarity breaks down badly when users talk indirectly. Someone saying “I’m tired” could mean physically tired, emotionally exhausted, depressed, burnt out, or just sleep deprived. Embeddings cluster these together in weird ways.
storing every conversation chunk destroys retrieval quality over time. The DB slowly fills with emotionally similar but contextually useless memories, and retrieval starts surfacing the wrong emotional moments (a rough sketch of the kind of write-time dedup gate I mean is below this list).
recency vs importance is difficult. Some memories from 4 months ago matter more than something said yesterday. Simple vector similarity doesn’t really capture emotional significance.
summarization causes personality drift. If you repeatedly compress memories into summaries, the bot slowly starts remembering an interpreted version of the user instead of the actual user.
pattern detection gets dangerous fast. There’s a huge difference between “this resembles an anxiety pattern” and accidentally over-pathologizing normal behavior.
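To make the storage-bloat point concrete, here is roughly the kind of write-time dedup gate I mean. This is a toy sketch: the threshold is a guess and there's no real vector-DB client, just raw embeddings in numpy.

```python
import numpy as np

def should_store(new_emb: np.ndarray, existing_embs: np.ndarray,
                 dup_threshold: float = 0.92) -> bool:
    """Return False when an almost-identical memory is already stored."""
    if existing_embs.size == 0:
        return True
    # cosine similarity of the candidate chunk against everything already stored
    a = new_emb / np.linalg.norm(new_emb)
    b = existing_embs / np.linalg.norm(existing_embs, axis=1, keepdims=True)
    sims = b @ a
    # near-duplicate: skip the insert (and ideally bump a "seen again" counter
    # on the existing memory instead of adding another row)
    return float(sims.max()) < dup_threshold
```

Even with something like this, the harder part is that the duplicates are rarely exact; they're emotionally similar moments with slightly different context.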
Currently experimenting with hybrid retrieval:
vector search
metadata filtering
emotional weighting
decayed long-term memory
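Put together, the re-ranking on top of whatever the vector search returns looks roughly like this. The weights, half-life, and field names are placeholders I'm still tuning, not a settled schema.

```python
import time

# placeholder knobs, not settled values
HALF_LIFE_DAYS = 60
W_SIM, W_EMO, W_RECENCY = 0.6, 0.25, 0.15

def memory_score(mem: dict, now: float | None = None) -> float:
    """Blend vector similarity, emotional weight, and an exponential recency decay."""
    now = now or time.time()
    age_days = (now - mem["created_at"]) / 86400.0
    recency = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return (W_SIM * mem["cosine_sim"]            # similarity score from the vector DB
            + W_EMO * mem["emotional_weight"]    # 0..1, assigned at write time
            + W_RECENCY * recency)

def rerank(candidates: list[dict], topic: str | None = None, k: int = 5) -> list[dict]:
    # crude metadata filter, e.g. keep only memories tagged with the active topic
    if topic:
        candidates = [c for c in candidates if topic in c.get("topics", [])]
    return sorted(candidates, key=memory_score, reverse=True)[:k]
```

The part that still bothers me is that a single weighted sum can't really express "this mattered a lot four months ago and it still matters", which is exactly the recency-vs-importance problem above.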
It still doesn't feel that strong. Is anyone else building companion-style or therapy-adjacent AI systems? If so, have you faced these issues as well, and how did you solve them?
A few things I’d love feedback on:
how are you deciding what deserves long-term memory vs temporary context? (a rough sketch of the kind of gate I mean is below this list)
are you re-embedding summaries or storing raw conversational moments separately?
how are you preventing retrieval loops where the model keeps reinforcing the same interpretation of the user?
what vector DBs are people actually happy with at scale for conversational memory?
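For the first question, this is the kind of gate I mean (crude sketch; the intensity and novelty signals are assumed to come from upstream classifiers, and the thresholds are arbitrary):

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    emotional_intensity: float  # 0..1, assumed to come from an upstream classifier
    novelty: float              # e.g. 1 - max cosine sim against existing memories
    user_flagged: bool          # user explicitly asked the bot to remember this

def memory_tier(chunk: Chunk) -> str:
    """Decide where a chunk lives; the thresholds here are arbitrary illustrations."""
    if chunk.user_flagged:
        return "long_term"
    if chunk.emotional_intensity > 0.7 and chunk.novelty > 0.3:
        return "long_term"
    if chunk.emotional_intensity > 0.4:
        return "session"  # keep only for the current conversation window
    return "discard"
```

Even a gate like this feels too blunt for the emotional-significance problem, which is partly why I'm asking.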
Also, to be clear: the app does not "diagnose" anyone directly. It only surfaces a suggestion, and user discretion is always recommended, since it's an AI doing the judging.