u/Basic_Bat_5139

Early on i'd pitch testing as part of the engagement, client would push back, i'd fold. ship fast, minimal QA, everyone's happy. seemed fine.

The pattern that showed up over time: 3 to 6 months after launch, something breaks. not always something major. usually just a flow that's been broken since day one that nobody caught because nobody was systematically checking anything. reviews start dropping. Users churn quietly. Then i get a call.

and the call is very different from the original conversation. i'm not pitching anything. they're calling me. the pain is real and concrete and they've done the math on what fixing it is going to cost vs what it would have cost to catch it earlier.

by that point it's usually 3x more expensive to fix than it would have been to catch pre-launch. the technical debt has compounded, the reviews are already written, the users who left didn't bother to tell you why.

i've stopped arguing about testing budget now. when a client pushes back i say "yeah let's see how it goes and revisit." they always come back. i don't have to convince anyone of anything because eventually reality does it for me.

drizz is just part of every engagement now. stopped thinking of it as a line item i have to defend. it's the same as code review, you just do it.

u/Basic_Bat_5139 — 12 days ago

We were on a time and materials contract. client wanted a new feature, we built it. somewhere in the process we introduced a regression in an unrelated flow, the kind of thing that's invisible unless you're specifically testing that path. client found it a week or two after the feature shipped. we fixed it. we billed the hours.

We had no idea. we thought we were billing for legitimate bug fixing work that had just turned up. it wasn't until months later when i was going back through some old ticket notes that i pieced together the timeline and realized we had probably caused it.

Nobody complained. the client was happy with us. but it sat wrong because they had absolutely no way to know what we broke vs what was already there vs what they introduced themselves. They just trusted us to be honest about it and we accidentally weren't.

Two things changed. we started doing before/after comparisons on every change: run the main flows before we start work and again when we're done, so regressions show up as ours, not theirs. we use drizz for this, it makes the diff pretty hard to argue with. and we started separating bug fix hours in invoices. anything that was our mistake we eat. no questions.
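
roughly what the before/after idea looks like stripped down to a script, if that helps. this is not drizz's interface, and the gradle invocations and test class names are placeholders; the point is just: same flows, run twice, diff the results.

```kotlin
// sketch of the before/after idea, not drizz's actual interface. each "flow"
// is a placeholder command that exercises one main user flow and exits 0 on pass.
fun runFlows(flows: Map<String, List<String>>): Map<String, Boolean> =
    flows.mapValues { (_, command) ->
        ProcessBuilder(command).inheritIO().start().waitFor() == 0
    }

fun main() {
    val flows = mapOf(
        // hypothetical gradle invocations -- substitute whatever drives your flows
        "login" to listOf("./gradlew", "connectedDebugAndroidTest",
            "-Pandroid.testInstrumentationRunnerArguments.class=com.example.LoginFlowTest"),
        "checkout" to listOf("./gradlew", "connectedDebugAndroidTest",
            "-Pandroid.testInstrumentationRunnerArguments.class=com.example.CheckoutFlowTest"),
    )

    val before = runFlows(flows)      // baseline, before we touch anything
    println("make your changes, then press enter")
    readLine()
    val after = runFlows(flows)       // same flows, after the change

    // anything that passed before and fails now is a regression we introduced
    val ours = flows.keys.filter { before[it] == true && after[it] == false }
    println("regressions introduced by us: $ours")
}
```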

we've lost money on some jobs since doing this. but client referrals have gone up a lot. apparently "they fix their own screw ups for free" travels faster than i would have expected.

u/Basic_Bat_5139 — 12 days ago

this is a little embarrassing to write but whatever.

We had a testing pipeline. it ran on every PR, flagged things, sent a slack message. for the first couple months the team actually looked at it. then slowly, without anyone deciding to do it, we stopped.

The reason is sort of obvious in retrospect. the pipeline was always flagging the same 6 or 7 known issues, stuff we'd consciously decided to live with until we had time to address it. so every build came back as "failed" and we'd all just scroll past it. failed became the default state.

One sprint we shipped a regression that broke the save flow on mid-range android. nothing we were working on was even close to the save system. something changed in how we were handling background processes and it only triggered on devices where the OS was aggressive about memory, which is exactly the kind of device that a chunk of our actual players use.
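
for what it's worth, you can reproduce that class of failure on whatever device is on your desk without waiting for a memory-starved phone to do it for you. a rough sketch, package name is a placeholder:

```kotlin
// background the app, let adb kill it the way an aggressive OS would, relaunch,
// then walk the save flow. `am kill` only kills processes the system considers
// safe to kill, i.e. exactly the background-death case.
fun adb(vararg args: String) =
    ProcessBuilder("adb", *args).inheritIO().start().waitFor()

fun main() {
    val pkg = "com.example.game"                                                  // placeholder
    adb("shell", "monkey", "-p", pkg, "-c", "android.intent.category.LAUNCHER", "1") // launch
    Thread.sleep(10_000)                               // do something that should be saved
    adb("shell", "input", "keyevent", "KEYCODE_HOME")  // background the app
    adb("shell", "am", "kill", pkg)                    // kill it like a memory-starved OS would
    adb("shell", "monkey", "-p", pkg, "-c", "android.intent.category.LAUNCHER", "1") // relaunch
    // now check the save flow -- manually or with a UI test -- and confirm nothing was lost
}
```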

Our pipeline probably flagged it. Genuinely don't know because no one was looking.

The fix we found wasn't adding more tests or writing better ones. It was making the output impossible to tune out. we switched to something that generates a short screen recording of what it actually ran after each build. we use drizz for this now. you cannot scroll past a 20-second video of your save flow silently failing. You just can't.
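
if you want to rig up the recording part yourself, adb already does it; not drizz, just the underlying trick, with the test class name as a placeholder:

```kotlin
// record the device screen with adb while the flow runs, then attach the clip
// to the build output instead of a pass/fail line nobody reads.
fun main() {
    val remote = "/sdcard/save_flow.mp4"

    // start recording on the device (screenrecord stops itself at the time limit)
    val recording = ProcessBuilder(
        "adb", "shell", "screenrecord", "--time-limit", "20", remote
    ).inheritIO().start()

    // drive the save flow while the recording runs
    ProcessBuilder(
        "./gradlew", "connectedDebugAndroidTest",
        "-Pandroid.testInstrumentationRunnerArguments.class=com.example.SaveFlowTest"
    ).inheritIO().start().waitFor()

    recording.waitFor()
    ProcessBuilder("adb", "pull", remote, "save_flow.mp4").inheritIO().start().waitFor()
    // save_flow.mp4 goes into the build's slack message / CI artifacts
}
```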

The number of tests didn't change. what changed was that the results were now in a format that felt real.

u/Basic_Bat_5139 — 12 days ago

not reading their product. their reviews.

every 1-star review is a free user research session that someone else's customer paid for. you get the exact words frustrated people use, the exact moment the thing broke, and the emotional state they were in when they sat down to write a public complaint about it.

my main competitor has a filter feature. users keep calling it "broken" in reviews. it's not broken, it works fine, it just resets when you background the app. users don't know that. they just know they spent 3 minutes setting up filters and the next morning they're gone.

we built our version of that feature last quarter and state persistence was a non-negotiable requirement from day one. wrote three lines of marketing copy around it. "filters that actually stick." that copy came directly from reading their reviews, not from any user research we ran ourselves.
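
the persistence itself is not fancy. a minimal sketch of what "filters that actually stick" means in code, with a made-up FilterState and SharedPreferences standing in for whatever storage you prefer:

```kotlin
// persist the selection the moment it changes, so backgrounding or process
// death can't wipe it.
import android.content.Context

data class FilterState(val categories: Set<String>, val maxPrice: Int)

class FilterStore(context: Context) {
    private val prefs = context.getSharedPreferences("filters", Context.MODE_PRIVATE)

    fun save(state: FilterState) {
        prefs.edit()
            .putStringSet("categories", state.categories)
            .putInt("maxPrice", state.maxPrice)
            .apply()   // written to disk asynchronously, survives process death
    }

    fun load() = FilterState(
        categories = prefs.getStringSet("categories", emptySet()) ?: emptySet(),
        maxPrice = prefs.getInt("maxPrice", Int.MAX_VALUE),
    )
}
```

SavedStateHandle or DataStore would do the same job; the only requirement is that the selection outlives the process.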

the other pattern i watch for: bugs that survive multiple releases. same crash report, different update, different users, same underlying problem. if something makes it through three release cycles it's either genuinely hard to fix or nobody senior enough has personally hit it yet.

we run flows through drizz before every release now partly because of this. watching your app from the outside through someone else's users' frustration changes how you think about what "working" means. your own internal testing is too warm. everyone's being generous because they built the thing.

anyway. competitor reviews, weekly, 30 minutes. it's the highest signal free research i've found and i've never seen anyone talk about it.

u/Basic_Bat_5139 — 12 days ago

so this happened about 4 months ago and i'm still a little bitter about it.

100% pass rate on our test suite the morning we deployed. i checked twice because i was feeling good about the release. deployed at noon. by 2pm we had 40 support tickets and our product manager was texting me.

the thing is the tests weren't even wrong. they were just testing in conditions that don't reflect how anyone actually uses the app. everything ran on a Pixel 6 emulator, wifi, english locale, software keyboard closed. clean little controlled environment.

the bug was the keyboard overlapping the confirm button on the checkout screen. on Samsung devices. One UI handles keyboard insets slightly differently and our layout wasn't compensating. we had three samsung users on the team. none of them caught it because when you're testing your own app you muscle memory through it and you're not looking for subtle layout shifts.
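
the eventual fix, for anyone hitting the same thing, was to stop trusting adjustResize to behave identically across OEM skins and react to the IME inset the system actually reports. a rough sketch, assuming the activity draws edge-to-edge and with the view reference as a placeholder:

```kotlin
// pad the checkout container by the reported keyboard inset so the bottom
// confirm button stays visible while the IME is open.
import android.view.View
import androidx.core.view.ViewCompat
import androidx.core.view.WindowInsetsCompat
import androidx.core.view.updatePadding

fun keepConfirmButtonAboveKeyboard(checkoutRoot: View) {
    ViewCompat.setOnApplyWindowInsetsListener(checkoutRoot) { view, insets ->
        val ime = insets.getInsets(WindowInsetsCompat.Type.ime())
        view.updatePadding(bottom = ime.bottom)   // push content up by the keyboard height
        insets
    }
}
```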

honestly the bug itself wasn't even that bad. it was patched in like 2 hours. what messed with me was how confident i was that morning. i had looked at that green dashboard and felt genuinely good about shipping.

we run flows across device profiles now before any release. started using drizz for this specifically, it catches the "this breaks on hardware we didn't test on" stuff that emulators miss. but the lesson i can't shake is that passing tests just means the things you thought to test are working. it says nothing about the things you didn't think to test.
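
even without a tool, a dumb loop over whatever physical devices are plugged in catches a surprising amount of this. a rough sketch, package and test names are placeholders:

```kotlin
// enumerate every attached device and run the same instrumented flow test on
// each one through adb, so OEM-specific breakage shows up before release.
fun sh(vararg cmd: String): String =
    ProcessBuilder(*cmd).redirectErrorStream(true).start()
        .inputStream.bufferedReader().readText()

fun main() {
    val serials = sh("adb", "devices").lines()
        .drop(1)                             // skip the "List of devices attached" header
        .filter { it.endsWith("\tdevice") }
        .map { it.substringBefore('\t') }

    for (serial in serials) {
        println("== $serial ==")
        println(
            sh(
                "adb", "-s", serial, "shell", "am", "instrument", "-w",
                "-e", "class", "com.example.checkout.CheckoutFlowTest",
                "com.example.test/androidx.test.runner.AndroidJUnitRunner"
            )
        )
    }
}
```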

u/Basic_Bat_5139 — 12 days ago