
AI made our velocity metrics look great. Then the midnight pages started.
After rolling out an AI coding assistant, most teams see the same pattern: PRs get bigger, cycle times drop, sprint records fall. Feels great. Then a few months in, the on-call rotation gets brutal.
This isn't a coincidence. The 2024 DORA report found the same association across the industry: teams with significantly higher AI adoption also showed higher change failure rates.
Three failure patterns explain most of it. None of them is new; they're old problems running faster:
1. Polished code fools reviewers. AI-generated code looks right. It follows conventions, reads cleanly, gets approved faster. But a model can produce a wrong implementation with the same fluency as a correct one. Reviewers pattern-match to familiar structure and skip the hard reasoning.
2. The model grades its own homework. When the same model writes the code and the tests, it tests its own assumptions — not your requirements. Coverage goes green. Edge cases nobody described stay untested.
3. AI can't see the whole system. The model only knows the code it's shown. It has no awareness of the shared retry queue, the upstream producer, the implicit guarantee held together by a three-year-old design decision. Clean-looking refactors quietly remove something critical.
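Pattern 2 is easy to see in miniature. Here is a minimal sketch, assuming a made-up requirement ("no late fee within a 3-day grace period") and hypothetical names throughout; none of it comes from the case study:

```python
from datetime import date

# Hypothetical requirement: no late fee within 3 days of the due date.
GRACE_DAYS = 3


def late_fee(due: date, paid: date, daily_fee: int = 5) -> int:
    """Clean, readable, and wrong: it never looks at GRACE_DAYS."""
    days_late = (paid - due).days
    if days_late <= 0:
        return 0
    return days_late * daily_fee


def check_written_from_the_code() -> bool:
    # Mirrors the implementation's own assumption, so it passes.
    return late_fee(date(2024, 3, 1), date(2024, 3, 11)) == 50


def check_written_from_the_spec() -> bool:
    # Derived from the requirement: 2 days late is inside the grace period.
    return late_fee(date(2024, 3, 1), date(2024, 3, 3)) == 0


print(check_written_from_the_code())  # True
print(check_written_from_the_spec())  # False
```

The first check keeps coverage green because it restates what the code already does. Only the check written from the requirement exposes the missing grace period.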
The fix isn't slowing down AI adoption. It's redesigning the delivery process so that what AI amplifies is worth amplifying:
- Write the spec before you write the prompt
- Tier changes by risk — anything touching payments or auth requires human business-logic review and a contract test against the live API
- Treat observability as a release gate — no monitoring dashboard, no merge
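The three rules above could be wired into one merge check. This is a rough sketch rather than a real CI integration: the path prefixes, field names, and dashboard URL are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical path prefixes; each team would maintain its own list.
HIGH_RISK_PREFIXES = ("services/payments/", "services/auth/")


@dataclass
class Change:
    paths: List[str]
    has_spec: bool = False                # spec written before the prompt
    human_logic_review: bool = False      # a person reviewed the business logic
    contract_test_passed: bool = False    # contract test ran against the live API
    dashboard_url: Optional[str] = None   # monitoring dashboard for this change


def risk_tier(change: Change) -> str:
    """Return "high" if any touched path is under a high-risk prefix."""
    if any(path.startswith(HIGH_RISK_PREFIXES) for path in change.paths):
        return "high"
    return "standard"


def merge_blockers(change: Change) -> List[str]:
    """List the reasons this change may not merge; empty means it may."""
    blockers = []
    if not change.has_spec:
        blockers.append("no written spec")
    if not change.dashboard_url:
        blockers.append("no monitoring dashboard")
    if risk_tier(change) == "high":
        if not change.human_logic_review:
            blockers.append("no human business-logic review")
        if not change.contract_test_passed:
            blockers.append("no contract test against the live API")
    return blockers


change = Change(
    paths=["services/payments/disburse.py"],
    has_spec=True,
    dashboard_url="https://example.invalid/dashboards/disbursements",
)
print(merge_blockers(change))
# ['no human business-logic review', 'no contract test against the live API']
```

Returning a list of blockers instead of a bare pass/fail tells the author exactly what is missing, which matters when the gate is new and people are still learning the tiers.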
Teams that already had strong practices got faster with AI. Teams that didn't started getting paged at midnight.
Full write-up with a FinTech case study (wrong field placement silently dropped disbursements during peak load, every unit test green): https://leaddev.com/ai/ai-coding-made-us-faster-why-did-incidents-increase