r/AITestingtooldrizz

▲ 10 r/AITestingtooldrizz+2 crossposts

I almost ignored a huge user signal.

I built an AI tool for games. It was working. Users were active. I was heads down improving it.

But I kept seeing characters and animations that didn't belong.
• A sweaty broccoli floret for a workout app.
• A neon octopus for a multi-agent AI tool.
• A tiny, armored armadillo for a password manager.

These weren't game makers. These were app builders.

They were hacking my game tool for something completely different. They wanted their brands to stand out and their users to feel something.

My first instinct? That’s not what this is for.
But the signal didn't stop. So I leaned in. Talked to them. Understood the problem.
No one could find a fast, affordable way to get a professional animated mascot.

So I also launched what they asked for.

u/missEves — 21 hours ago
🔥 Hot ▲ 189 r/AITestingtooldrizz+1 crossposts

How I say no to a client request without losing the relationship (Tutorial)

I am a founder myself, and saying no to a customer is uncomfortable every single time, even when you are completely sure it is the right call.

What worked for us, after getting it wrong a few times, is one question: is this person describing a problem only they have, or a problem a large chunk of our users share? If it is only them, we pass. If it's bigger than them, it goes into the actual roadmap conversation.

How you say it matters more than the decision. We never push back on the request itself, but we do push back on the specific solution they suggested, while making it obvious we actually understand what is frustrating them. Those are two different things, and people react to them completely differently.

Something like "we get that X is slowing your team down; we are not going to build Y, but here is how we are thinking about solving X and roughly when" is different from "that is not something we are working on right now." The customers who walked away after a no were almost never leaving because of the no; the product just was not the right fit, and the feature request was the first honest signal of that. When you say no clearly and someone stays, the relationship gets more solid, because they know your yes is not just you avoiding an awkward conversation.

u/Same_Technology_6491 — 3 days ago

The QA role is splitting into two

Job descriptions for QA engineers in 2026 feel like they were written five years ago, and the gap between what those descriptions say and what the role actually requires is getting wider every three or four months.

What's actually happening is that the role is splitting. One side is writing test infrastructure: building automation frameworks, working inside CI/CD pipelines, understanding distributed systems. That side is closer to a software engineer. The other side is becoming more strategic, closer to product: focused on risk assessment, defining what needs to be tested and why, and understanding user behavior and where it diverges from how the system was designed.

Both are legitimate and valuable, but they require completely different skills, and almost no company is hiring for them as separate roles yet. They are still writing one job description that asks for both, then wondering why the person they hired is strong in one area and struggling in the other. The industry will catch up eventually, but right now there are a lot of QA engineers doing two jobs under one title and getting paid properly for neither.

reddit.com
u/Same_Technology_6491 — 3 hours ago
▲ 31 r/AITestingtooldrizz+1 crossposts

i work in testing and my team replaced genuine testing instinct with AI tooling.

i'm a QA engineer at a corporate setup. one tester, multiple markets, and a pipeline that needs proof of passing runs across three execution platforms before anything merges. sometime around february the team decided the repetitive overhead was too high and brought in AI tooling (drizz, testim, copilot) to absorb it

on the surface it worked exactly as advertised. regression coverage went up. sprint velocity improved. the number of automated test cases in the suite nearly doubled in two months. management looked at the dashboard and saw green

what the dashboard doesn't show is that nobody fully understands what half those tests are actually verifying anymore

the assertions were generated fast. the flows were mapped by tooling that has no concept of what the product is supposed to do for a real user. tests were written against implementation detail instead of behaviour because the AI had no way of knowing the difference and nobody slowed down long enough to catch it. the suite grew and the collective comprehension of what the suite meant quietly shrank in the opposite direction
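that implementation-detail vs behaviour gap is easy to state and easy to lose in generated assertions. a toy sketch (the function and names are mine, not from any real suite):

```typescript
// toy example: the same function asserted two ways.
// `steps` is internal detail; `total` is the behaviour a user sees.
function cartTotal(prices: number[], taxRate: number) {
  const subtotal = prices.reduce((a, b) => a + b, 0);
  const steps = [subtotal, subtotal * taxRate]; // intermediate state only
  return { total: subtotal * (1 + taxRate), steps };
}

// behavioural assertion: survives any refactor that keeps the maths right
console.assert(Math.abs(cartTotal([10], 0.1).total - 11) < 1e-9);

// implementation-coupled assertion: breaks the moment `steps` is reshaped
// or removed, even though nothing a user sees has changed
console.assert(cartTotal([10], 0.1).steps.length === 2);
```

a generated suite full of the second kind looks exactly as green as one full of the first, which is the whole problem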

the junior testers who came in after the tooling was already in place have almost no debugging instinct. they can prompt. they cannot tell you why a flaky test is flaky or what an assertion being too tightly coupled to internal state actually means for regression confidence. that understanding is supposed to come from writing tests badly first and learning from it. the tooling skipped that entire phase and called it efficiency

when something fails in production now the investigation takes longer than it used to. not because the bugs are more complex but because the test that should have caught it was generated by something that approximated coverage without understanding the risk surface

the velocity numbers are real. the sprint metrics are green. and i genuinely cannot tell you with confidence whether the next release is safe to ship or whether we have just built an elaborate system for feeling like we can

that gap between appearance and reality is the part nobody is measuring and nobody wants to talk about because the dashboard looks fine

u/corporate925 — 3 days ago

Shadow DOM is going to make me quit QA entirely

I am so tired.

We had a huge engineering push to build an internal component library using native Web Components. Architecturally, the devs love it. For me? Every single standard input, dropdown, and button is now encapsulated inside a Shadow Root.

Traditional locators literally bounce off the Shadow DOM. To interact with a simple text field that I can clearly see with my own two eyes on my monitor, I have to write deep traversal scripts, piercing through multiple shadow boundaries just to dispatch a keyboard event. I spent three hours today debugging a failing script, only to realize a dev wrapped a button in a new web component and hid it from the light DOM.

It feels completely backwards. I am fighting the architecture of the application just to verify that a button clicks.

Has anyone successfully detached their testing from the DOM tree entirely? I just want my test to look at the screen and click the button without needing a map of how the engineers packaged the code.
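For what it's worth, some runners already absorb part of this: Playwright's CSS locators, for example, pierce open shadow roots by default. If you are stuck writing the traversal yourself, the recursion is at least mechanical. A sketch, using a minimal hypothetical `ElementLike` interface so it reads outside a browser; in a real page the nodes would be `Element`s and `matches` would be `Element.prototype.matches`. Note this only reaches open shadow roots; closed ones expose no `shadowRoot` at all.

```typescript
// Minimal node shape so the search can be illustrated outside a browser.
interface ElementLike {
  matches(selector: string): boolean;
  children: ElementLike[];
  shadowRoot?: { children: ElementLike[] } | null;
}

// Depth-first search that descends into shadow roots as well as light children.
function deepQuery(root: ElementLike, selector: string): ElementLike | null {
  if (root.matches(selector)) return root;
  const scopes = [root.children, root.shadowRoot?.children ?? []];
  for (const scope of scopes) {
    for (const child of scope) {
      const hit = deepQuery(child, selector);
      if (hit) return hit;
    }
  }
  return null;
}
```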

u/Maxl-2453 — 1 day ago

Found a simple way to manage QA without messy tools

Qualityfolio is a tool that tries to bring QA directly into the development workflow. Instead of external tools, it uses Markdown in the repo for tests, CI for execution, and generates dashboards from actual results.

If you have a few moments, I would really appreciate your thoughts.
https://qualityfolio.dev/

GitHub: https://github.com/opsfolio/Qualityfolio

We are looking for honest feedback from fellow QA professionals, any input from you would be hugely helpful. Thanks so much! 🙂

u/Background-Donkey531 — 7 hours ago

PSA: If marketing has access to Google Tag Manager, your automated tests are already dead.

Consider this a warning to anyone setting up a new E2E suite.

Stage 1: False Hope. You build a beautiful suite of 150 critical user journey tests. They run perfectly in your local environment. You feel like a god.

Stage 2: The Ambush. Your marketing team decides they want to run a weekend promo. They inject a massive newsletter popup via GTM that loads asynchronously, usually about 3 seconds after the page renders.

Stage 3: The Slot Machine Pipeline. Your tests now randomly pass or fail based on server speed. If the script clicks the checkout button in 2.5 seconds, it passes. If the server is slow and the popup loads first, it intercepts the click and the pipeline crashes.

I don't want to wrap every single .click() command in my entire framework in a massive try/catch block just to look for a random modal. Why is it so incredibly hard to get a test framework to just act like a normal human being and dismiss an overlay if it sees one? How do you guys handle unexpected third-party scripts without writing incredibly ugly test code?
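One pattern that avoids sprinkling try/catch everywhere is a single guard: attempt the action, and only if it throws, check for an overlay, dismiss it, and retry once. A framework-agnostic sketch with injected callbacks (the function and parameter names here are hypothetical, not a real API):

```typescript
// "Click, and if an overlay intercepted it, dismiss and retry" in one place.
async function clickWithOverlayGuard(
  click: () => Promise<void>,          // the real action; may throw if blocked
  overlayVisible: () => Promise<boolean>,
  dismissOverlay: () => Promise<void>,
  retries = 1,
): Promise<void> {
  for (let attempt = 0; ; attempt++) {
    try {
      await click();
      return;
    } catch (err) {
      // Only retry when we can actually see (and remove) an overlay.
      if (attempt >= retries || !(await overlayVisible())) throw err;
      await dismissOverlay();
    }
  }
}
```

Playwright users may not need the wrapper at all: `page.addLocatorHandler(locator, handler)` (added in v1.42) registers a dismissal routine that runs when the locator shows up during actionability waits, which is roughly this pattern built in.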

u/kayanokoji02 — 1 day ago

My 100% green Playwright suite just let a critical UI bug slip into production, and it completely changed how I view E2E testing.

I’m still recovering from a massive post-mortem we had on Monday. I spent the last three months building a rock-solid automation suite for our core checkout flow. Every PR had to pass it, the pipeline was consistently green, and we felt invincible.

Last Thursday, the marketing team pushed a "temporary" sticky promotional banner to the mobile view. The devs merged it, my E2E suite ran, clicked the "Confirm Order" button perfectly, and gave a green light. We deployed.

Friday morning, we realized mobile conversions had flatlined for 12 hours.

Turns out, the new sticky banner had a z-index issue and physically covered the entire checkout button on smaller screens. Real users literally could not tap it. But my script didn't care. It bypassed the visual rendering layer, found the <button> node in the DOM, and fired a click event directly via JavaScript. It gave us total false confidence because it did something a human physically couldn't do.

It made me realize that traditional automation is fundamentally flawed: we aren't testing the user's experience, we are just testing the DOM state.

Valuable Takeaways & Resources I’m looking into:

  • Audit your framework's actionability checks: If you use Playwright, make sure you aren't overusing .click({ force: true }). For Cypress, understand how it checks for visibility. But even then, they can be tricked by CSS transforms.
  • Visual Regression is a bandaid, not a cure: We looked into tools like Percy and BackstopJS, but they just flag pixel differences. I don't want to approve 50 baseline images every time a dev changes a padding value.
  • The Philosophical Gap: We need to start thinking about how to test visual intent rather than code implementation. Has anyone found a reliable way to test what the screen actually looks like and interacts like, without relying on the hidden HTML?
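On the first takeaway: one cheap audit is just grepping the suite for forced actions, since `force: true` is exactly the option that skips the visibility and hit-target checks. A deliberately naive sketch (it assumes the options object sits on the same line as the call, so it will miss multi-line variants):

```typescript
// Flag 1-based line numbers where an action passes force: true.
function findForcedActions(source: string): number[] {
  const pattern = /\.(click|dblclick|tap|check|uncheck|hover|fill)\([^)]*force\s*:\s*true/;
  return source
    .split("\n")
    .flatMap((line, i) => (pattern.test(line) ? [i + 1] : []));
}
```

Running it over the spec files in CI and failing on new hits is a crude but effective ratchet against false-confidence clicks creeping back in.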
u/dhana231_231 — 1 day ago
▲ 34 r/AITestingtooldrizz+1 crossposts

our first enterprise client almost killed our company

We signed our first enterprise client eight months in. We were confident and the team was excited, we celebrated, then the actual work started

enterprise means compliance reviews, security audits, procurement processes, legal redlines on contracts that took three months to close, a dedicated slack channel where requests came in at all hours, custom feature asks that were reasonable individually and impossible collectively, an onboarding process that consumed two of our five engineers for six weeks

we built the product for fast moving mobile teams that wanted to get started in minutes, enterprise wanted everything we didn't have yet, SSO, audit logs, custom data retention, on premise deployment options, SLAs with penalty clauses, a named customer success contact which at our size meant a founder on every call

revenue looked great on paper but underneath it was ugly, velocity dropped, the rest of our pipeline stalled because we had no bandwidth, and two smaller customers churned because response times slowed down and we didn't notice fast enough

took us four months to stabilize, we learned more about where drizz actually needed to be in that period than in the six months before it, wouldn't change it but I would have gone in with completely different expectations if I'd known what was coming

edit: yes our product is an ai agent and I'm writing this just so other founders contemplate before signing any client

u/Same_Technology_6491 — 6 days ago

Dev velocity has 5x'd this year. My testing velocity hasn't.

This is more of an open discussion, but is anyone else feeling completely left behind by the speed frontend devs are moving at right now?

Since our team adopted Copilot and Cursor, features that used to take them three days are being knocked out in an afternoon. They are shipping insane amounts of UI code into staging.

The issue is that writing robust automation scripts didn't get faster. And worse, the AI-generated code they are pushing is often super messy under the hood, weird wrapper divs, inconsistent naming, etc. So my traditional DOM-based scripts are breaking constantly trying to hook into it.

Management is starting to look at me like I'm the bottleneck. I physically cannot map out the DOM and write locator-based tests at the speed a machine generates the front-end code. Are you guys just accepting lower test coverage, or is there a completely different way to approach this that I'm missing?

u/Afraid-Bobcat6676 — 1 day ago
▲ 7 r/AITestingtooldrizz+1 crossposts

tracked 3 months of my own PR failures. the test suite is blocking me in ways nobody else can see

around january my commit pace started dropping. not because features got harder but because i was spending more time getting PRs through the gate than actually developing. so i tracked my past three months: 30 plus PR failures across my own commits. the reason wasn't what i expected

genuine regressions were the minority. the majority split across three patterns: flaky locators tied to DOM attributes that shift between deployments, environment-specific failures from configuration drift between staging and rollout that nobody formally documented, and tests asserting against implementation details rather than behaviour. that last one is the worst. refactored a transformation module in february, cleaner logic, identical output, four tests failed because they were coupled to intermediate state that no longer existed. the feature worked but the suite disagreed

a lot of these tests were written under automation pressure. the team needed coverage numbers up, the sprint had a TC automation quota, so tests got written fast. no time to think properly about selector strategy, assertion design, or whether the test was actually verifying behaviour versus internal structure. the suite grew, the metrics looked healthy, and the underlying fragility got baked in quietly

that's what i've been committing against for three months

the invisibility of it is what actually gets to me. sprint metrics don't capture time spent re-running pipelines or diagnosing flaky failures. from the outside my velocity looked low. the suite looked green. those two things were directly connected and nobody was looking at that relationship

started logging failure reasons instead of just counts. flaky infrastructure, environment drift, wrong assertion target, genuine regression. each one has a completely different fix and collapsing them all into a single failure metric is how this stays invisible for months
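that tagging step is automatable in a crude first pass. a sketch of keyword-based bucketing, with the caveat that every pattern here is a made-up heuristic and real triage still needs a human look:

```typescript
type FailureReason =
  | "flaky-locator"
  | "env-drift"
  | "impl-coupled-assertion"
  | "genuine-regression";

// order matters: first match wins. anything unmatched defaults to
// genuine-regression so it gets human attention instead of being
// silently bucketed as noise.
const heuristics: [RegExp, FailureReason][] = [
  [/selector|locator|element (not found|detached)/i, "flaky-locator"],
  [/ECONNREFUSED|timeout connecting|missing env|config/i, "env-drift"],
  [/internal state|private field|intermediate/i, "impl-coupled-assertion"],
];

function categorize(message: string): FailureReason {
  for (const [re, reason] of heuristics) if (re.test(message)) return reason;
  return "genuine-regression";
}

function tally(messages: string[]): Map<FailureReason, number> {
  const counts = new Map<FailureReason, number>();
  for (const m of messages) {
    const r = categorize(m);
    counts.set(r, (counts.get(r) ?? 0) + 1);
  }
  return counts;
}
```

even a rough tally like this makes the "one failure metric" problem visible, because each bucket points at a completely different fix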

I am not sure what the fix looks like at the team level yet

u/corporate925 — 4 days ago

QA testers needed for whimsy app

Tired of digging through Google Drive, Dropbox, email attachments, and chats just to find one file?

We built Whimsy — a unified file command center that brings everything together across 10+ providers.

🔍 Search all your files in one place
🧠 Automatically organize your existing data so it’s actually findable
🔄 Seamless transfers between cloud providers
🤖 Fetch files directly from Telegram / WhatsApp with simple commands

No more scattered storage. No more “where did I save that?” moments.

We’re opening a closed beta for 50 early users to help shape the product.

If you’re interested, drop a comment in r/Numeracode with your email and we’ll send invites to the first 50. Built for people who are tired of chaos and just want their files to work.

Let’s fix file management.

u/No_Beach_3571 — 3 days ago

Free Lightweight alternative to n8n | dev toolkit

Hey folks 👋

I’ve been building a developer tool over the past few months that started as a simple webhook testing tool… and it’s slowly evolving into something much more useful for real-world automation.

Right now it supports:

- Inspecting webhooks (instant endpoint, no signup)

- Creating custom workflows (like n8n)

- Mocking APIs / servers for testing integrations

But here’s why I’m posting here:

I’m looking to work directly with a few e-commerce folks (Shopify, WooCommerce, custom setups, etc.) to help you automate parts of your business personally and for free.

Things like:

- Order → fulfillment workflows

- Payment → notification pipelines

- Inventory sync between services

- Custom webhook-based automations

- Replacing Zapier-type setups with something more flexible

I’m a senior backend engineer, and I’m trying to shape this product based on real use cases, not guesses.

If you have:

- messy automations

- manual processes you hate

- or webhook chaos

Drop a comment or DM me. I’ll help you set it up, and in return I learn what actually matters.

No sales pitch. Just building + helping.

Would love to collaborate 🤝

u/shabbir_hasan — 3 days ago

Just launched my first lightweight SAAS tool on Product hunt!


Hey Guys,

Hope y'all are well.

I'm a solopreneur; after working on 10+ products and shipping none, I'm finally releasing one!

I kept juggling multiple tools: first I tried Replit, then Emergent, then Floot, and each time I thought I had found a platform where I could work on all my ideas, but failed miserably.

So I started to feel that these tools don't help generate production-grade apps or websites.

But Windsurf surprised me. Their recent changes have significantly affected how I work, but nevertheless, it helped me ship.

Check out the launch here: https://www.producthunt.com/p/cheq/cheq-we-built-a-checklist-app-because-every-simple-to-do-app-felt-overengineered

pre-launch https://www.producthunt.com/products/cheq/cheq/prelaunch

If you're a vibecoder like me, your support would greatly help me!

If you can, please try the app; some feedback would be great!

Thank you

u/Shaggy6469 — 2 days ago
▲ 7 r/AITestingtooldrizz+1 crossposts

you need to know about testing payments on mobile before you go live

I worked at a fintech before drizz, and the payment bugs that made it to production were always the same category, but they were not the obvious ones. Most teams check that the happy path works: card goes in, payment succeeds, user sees confirmation, and they move on. But that is maybe 20% of what can actually go wrong.

User loses network halfway through a transaction and taps pay again, not knowing the first one already went through. Device locks during 3DS verification and the session times out; user thinks the payment failed, tries again, same problem. Keyboard pops up and covers the confirm button on a specific android screen size; user cannot complete the purchase and never tells you why they dropped off.

The device stuff is where it gets really specific: budget android phones with 4GB RAM will sometimes drop the payment screen from memory mid-flow because the OS is aggressively clearing background processes, and certain android 12 builds had issues with payment SDKs that nobody caught until users hit them at scale.

We once traced a bug that only appeared on the 28th of every month to a timezone offset in how a billing cycle was calculated. It took three weeks to find because nobody thought to test on that specific date. None of this shows up in a standard automated suite, because the suite is running in clean, controlled conditions, and real payments do not happen in clean, controlled conditions.
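That 28th-of-the-month class of bug is reproducible with plain date arithmetic, no device farm needed. A sketch (the function and scenario are illustrative, not the actual fintech code):

```typescript
// Which calendar day does a charge land on in a given UTC offset?
// Naive billing code that mixes UTC timestamps with local-time getters
// produces exactly the off-by-one this computes explicitly.
function billingDayInZone(utcMs: number, offsetMinutes: number): number {
  return new Date(utcMs + offsetMinutes * 60_000).getUTCDate();
}

// A charge recorded at 01:30 UTC on the 28th...
const chargeAt = Date.UTC(2024, 0, 28, 1, 30);
console.log(billingDayInZone(chargeAt, 0));    // 28 in UTC
console.log(billingDayInZone(chargeAt, -300)); // 27 in UTC-5: wrong cycle day
```

A handful of fixed-date, fixed-offset cases like this can run in any CI suite, which is exactly the kind of dirty-conditions coverage the clean happy-path run never provides.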

u/Same_Technology_6491 — 4 days ago

i'm drowning and my team ain't doing shit

Somehow migrating our frontend to web components was the right call for the product and has made our test suite nearly unusable at the same time.

Piercing shadow DOM to interact with elements that are completely visible on screen is one of the more astonishing things I do regularly now. I can see the button, the user can see the button, but clicking it takes three layers of shadow root traversal and still fails intermittently in CI for reasons I cannot consistently reproduce. We have built helper functions on top of helper functions to handle this, and the test code is now more complex than the application code it is supposed to be validating. That is not a sustainable place to be.

the deeper problem is that dom based testing was already showing its age before web components made it worse. the assumption that the structure of the html is a reliable proxy for what the user experiences has always been shaky, and modern frontend architecture is making it shakier every year

not sure if the answer is better tooling or a different testing philosophy entirely or just accepting that certain categories of ui complexity are going to keep breaking selector based approaches no matter how clever the helper functions get

u/Same_Technology_6491 — 4 days ago

How are you actually automating tests for highly dynamic front ends where the element IDs are totally randomized on every build

We have a really complex React application where almost all the class names and element IDs are dynamically generated on every single build so using standard static locators is basically impossible.

We tried using data-testid attributes everywhere but the developers hate it because it clutters the codebase and honestly they keep forgetting to add them to new components anyway which leaves QA constantly scrambling to find alternative ways to select elements using horrible nested CSS paths.

I am starting to think that treating the UI like a code document instead of a visual interface is just a massive anti-pattern at this point, so how are the rest of you handling hyper-dynamic front ends without going completely insane?
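One way to stay sane is to codify the fallback order once instead of hand-writing nested CSS each time: user-facing attributes first, generated IDs last. A sketch over a hypothetical `ElementInfo` shape (the `role=...` output mirrors Playwright-style role selectors, but the helper itself is framework-agnostic and all the field names are illustrative):

```typescript
// Hypothetical element metadata gathered however your tooling allows.
interface ElementInfo {
  testId?: string;
  role?: string;
  accessibleName?: string;
  id?: string; // often build-hashed, so lowest priority
}

// Prefer stable, user-facing hooks; fall back to generated IDs only last.
function bestSelector(el: ElementInfo): string | null {
  if (el.testId) return `[data-testid="${el.testId}"]`;
  if (el.role && el.accessibleName) return `role=${el.role}[name="${el.accessibleName}"]`;
  if (el.id) return `#${el.id}`;
  return null;
}
```

It does not fix developers forgetting `data-testid`, but it does mean a missing attribute degrades to a role/name lookup instead of a brittle nested CSS path.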

u/Afraid-Bobcat6676 — 7 days ago