We built an application for early mental health screening
Hey everyone,
For the last three months, I've been engineering a system called Lumen. It's an Android application that uses passive behavioral anomaly detection to screen for early mental health risk (depression/anxiety onset).
The core problem with current digital phenotyping is that it relies on population-level machine learning: what looks like depression for an extrovert might be a normal baseline for an introvert. Lumen addresses this by taking an idiographic (within-person) approach. It learns your personal baseline over 28 days and flags sustained deviations from it.
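To make the idiographic idea concrete, here is a minimal sketch of within-person deviation flagging: learn a personal mean and spread from a 28-day calibration window, then score new days against that baseline. The function name, threshold, and feature are my own illustration, not Lumen's actual implementation.

```python
import statistics

BASELINE_DAYS = 28  # the post's 28-day calibration window

def flag_deviation(history, today_value, z_threshold=2.0):
    """Score today's value against a personal baseline (hypothetical helper).

    history: daily values of one behavioral feature from the calibration window.
    Returns (z_score, flagged): the standardized deviation and whether it
    exceeds the threshold for this individual.
    """
    mu = statistics.mean(history)
    sigma = statistics.stdev(history) or 1e-9  # guard against a flat baseline
    z = (today_value - mu) / sigma
    return z, abs(z) >= z_threshold
```

For example, a user who reliably sleeps about seven hours would have a tight personal sigma, so a three-hour night is a large negative z-score and gets flagged, while the same three hours might be unremarkable against a population-level model.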
The Engineering & Architecture: My goal was a practical, background-running mobile utility, not a lab-only prototype.
- Privacy-by-Design: Processes 29 behavioral features entirely on-device. No raw data leaves the phone.
- Doze Mode Resilient: Engineered to survive aggressive OEM process killers by deferring heavy Layer 1/Layer 2 processing to overnight charging windows.
- The Math: A dual-layer architecture. Layer 1 tracks aggregate shifts using clinically weighted z-scores and EWMA velocity. Layer 2 maps "AppDNA" (app-abandonment rates, KL-divergence-based rhythm dissolution) to measure the texture of phone usage, not just screen time. An evidence engine, inspired by Statistical Process Control, ensures that only multi-day, sustained shifts trigger alerts.
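The three ingredients named above (EWMA smoothing, KL divergence between usage rhythms, and an SPC-style persistence gate) can each be sketched in a few lines. These are generic textbook versions under assumed parameter names, not Lumen's actual code:

```python
import math

def ewma(values, alpha=0.3):
    """Exponentially weighted moving average of a daily series (Layer 1 style).

    alpha is the smoothing factor: higher alpha reacts faster to recent days.
    """
    s = values[0]
    for v in values[1:]:
        s = alpha * v + (1 - alpha) * s
    return s

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) between two usage-rhythm distributions, e.g. hourly app-use
    histograms for the baseline week vs. the current week. Larger values mean
    the daily rhythm has drifted further from baseline ("rhythm dissolution")."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

class EvidenceEngine:
    """SPC-inspired run rule: alert only after `persistence` consecutive
    out-of-control days, so one bad day never triggers an alert."""

    def __init__(self, threshold=2.0, persistence=3):
        self.threshold = threshold
        self.persistence = persistence
        self.run = 0  # length of the current out-of-control streak

    def update(self, z):
        """Feed one day's z-score; returns True when the streak is long enough."""
        self.run = self.run + 1 if abs(z) >= self.threshold else 0
        return self.run >= self.persistence
```

The evidence engine is the piece that distinguishes a sustained shift from noise: with `persistence=3`, the z-score sequence 0.5, 2.5, 2.5, 2.5 only alerts on the third consecutive exceedance, and a single recovered day resets the streak.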
The Problem I'm Hitting: The mathematical architecture is mature and behaves as designed in 180-day synthetic simulations. However, validating a longitudinal, within-person tool against existing cross-sectional datasets (like the StudentLife dataset) is a poor fit: those datasets don't capture the per-person baselines and sustained deviations the system is built to detect, so any benchmark against them misrepresents its capabilities.
In short: I have a mature systems-engineering prototype, but not yet a clinically validated tool.