
Shipped v0.3.0 of my fitness app with AI photo detection - the demo took a day, the product took weeks. Here's the honest breakdown.
Better is my solo-dev fitness + nutrition app, built with Kotlin Multiplatform (one codebase, Android + iOS). The headline feature in v0.3.0: point your camera at a meal, get instant calorie and macro logging.
I want to share the honest build retrospective because the gap between "working prototype" and "shippable product" was bigger than anything I've experienced.
The problem that drove this feature:
Every single beta tester gave me the same feedback: "Workout tracking is great. Nutrition logging is too slow to stick with." Manual food search takes 2-3 minutes per meal. Nobody does that 3 times a day, every day. People would track workouts religiously but abandon nutrition after a week.
Photo detection was the obvious answer. Making it reliable was not.
Timeline vs. expectations:
- Days 1-2: Working prototype. Clear photo of a single food item on a white plate, good lighting. Nailed it. I thought I was 80% done.
- Weeks 1-3: Reality check. Blurry photos. Dim restaurant lighting. Plates with 6 different items touching each other. Home-cooked Central Asian food that no Western-trained model has ever seen. Street food wrapped in paper. I was maybe 30% done.
- Weeks 3-5: Portion estimation. This is the genuinely hard part that nobody talks about. Identifying "rice" vs. "pasta" is relatively easy. Estimating whether a 2D image shows 150g or 250g of rice is a different problem entirely.
60% of total dev time went into edge cases that cover about 30% of real-world usage. But that 30% is the difference between a demo you show investors and a product users actually trust.
Architecture decisions (for the technical folks):
- Camera capture is platform-specific (CameraX on Android, AVFoundation on iOS) - this is one of the few places where expect/actual beats interface + DI
- Detection pipeline lives in shared commonMain - both platforms get identical behavior
- On-device processing where possible, server fallback for complex multi-item plates
- State machine: Idle -> Capturing -> Analyzing -> Results, with an Error state reachable when capture or analysis fails, modeled as a sealed interface with a Molecule presenter managing the flow
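The on-device-first routing can be expressed as a small pure function. This is an illustrative sketch only: the names, the confidence threshold, and the item-count cutoff are my assumptions, not Better's actual values.

```kotlin
data class Detection(val label: String, val confidence: Double)

// Illustrative policy: trust on-device results only when the plate is simple
// and every detection is confident; otherwise escalate to the server model.
// The 0.7 threshold and 2-item cutoff are assumptions, not the app's values.
fun chooseResults(
    onDevice: List<Detection>,
    serverFallback: () -> List<Detection>,
    minConfidence: Double = 0.7,
    maxLocalItems: Int = 2,
): List<Detection> =
    if (onDevice.isNotEmpty() &&
        onDevice.size <= maxLocalItems &&
        onDevice.all { it.confidence >= minConfidence }
    ) onDevice
    else serverFallback()
```

Keeping the policy a pure function over already-computed detections means it lives in commonMain and is testable without a camera, a model, or a network.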
85% shared code across platforms. KMP genuinely lets you build like a team of 5 while being a team of 1.
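The Idle -> Capturing -> Analyzing -> Results/Error flow from the list above can be modeled as a pure reducer over sealed interfaces, which is what makes it trivially unit-testable in shared code. A minimal sketch with made-up state and event names, not Better's real types (in the app, a Molecule presenter would drive something like this):

```kotlin
sealed interface ScanEvent {
    data object Shutter : ScanEvent
    data object PhotoReady : ScanEvent
    data class Detected(val items: List<String>) : ScanEvent
    data class Failed(val reason: String) : ScanEvent
    data object Retry : ScanEvent
}

sealed interface ScanState {
    data object Idle : ScanState
    data object Capturing : ScanState
    data object Analyzing : ScanState
    data class Results(val items: List<String>) : ScanState
    data class Error(val reason: String) : ScanState
}

// Pure transition function: every (state, event) pair maps to a new state,
// and anything unexpected is a no-op rather than a crash.
fun reduce(state: ScanState, event: ScanEvent): ScanState = when (state) {
    ScanState.Idle -> if (event is ScanEvent.Shutter) ScanState.Capturing else state
    ScanState.Capturing -> when (event) {
        is ScanEvent.PhotoReady -> ScanState.Analyzing
        is ScanEvent.Failed -> ScanState.Error(event.reason)
        else -> state
    }
    ScanState.Analyzing -> when (event) {
        is ScanEvent.Detected -> ScanState.Results(event.items)
        is ScanEvent.Failed -> ScanState.Error(event.reason)
        else -> state
    }
    is ScanState.Results, is ScanState.Error ->
        if (event is ScanEvent.Retry) ScanState.Idle else state
}
```

Because the reducer has no platform dependencies, both Android and iOS exercise the exact same transition logic, and edge cases (a Failed event mid-analysis, a stray Retry while Idle) become one-line unit tests.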
Business model context:
Better is $3.99/month. Everything currently in the app stays free forever. Photo detection is included free right now because I want the data flywheel - more photos logged means better detection accuracy over time. I will never paywall existing features. Only genuinely new stuff goes premium.
Revenue: pre-revenue. Still focused on building the product and acquiring early users. Not optimizing for money yet.
What actually matters long-term:
Photo detection isn't the moat. It's the enabler.
Better owns both workout AND nutrition data. When nutrition logging becomes effortless, people actually do it consistently. And consistent data from both domains unlocks something no competitor can build: "Your bench press improved 12% during weeks you hit your protein target."
MyFitnessPal has nutrition. Hevy has workouts. Neither can connect the two. That's the bet.
Biggest lesson from this feature:
The demo is never the product. If you're building anything with AI/ML, triple your timeline estimate. The happy path comes together fast; the edge cases are where you actually earn user trust. And user trust is the only thing that matters for retention.
Question for other builders: What's the hardest "last mile" problem you've hit shipping an AI-powered feature? I'm curious whether the pattern of "60% of time on 30% of cases" is universal or if I just scoped poorly.
Google Play: https://play.google.com/store/apps/details?id=io.behzodhalil.better