u/alejandro_such

▲ 0 r/devops

Multi-agent AI code review: 17 of 18 findings false. Lessons from burning credits

I've been personally experimenting with multi-agent code review: separate specialist agents for security, implementation, testing, and architecture, running them on every PR I open. On one last month, 18 findings reported, only 1 survived manual verification.

The failure modes were consistent:

  • Agents flagged files that weren't in the diff at all.
  • "code does X" claims without quoting the offending line.
  • Functions read in isolation, missing upstream guards.
  • Pre-existing patterns flagged as PR-introduced.

What stuck: a triage step before dispatching specialists. It pulls the + line list from gh pr diff, builds a small context bundle (changed files, repo conventions, analogous code), and picks which specialists to run based on PR size and surface. Docs-only? Skip security and architecture. Multi-file change touching auth? All four. Specialists must then quote the offending line for any finding they raise; if they can't, it's dropped before reaching me.

I haven't wired this into CI yet, but the triage step would slot in cleanly as a pre-merge check.

If you're running AI review on real PRs, how are you bounding the false-positive rate? Per-agent diff scope, post-hoc filtering, or something else?

reddit.com
u/alejandro_such — 1 day ago
▲ 3 r/github

One of the things we do at GitKraken, both internally and across our tools, is work a lot with GitHub Actions. So when we migrated one of our integrations from workflow_dispatch to repository_dispatch last week, the contract simplification was cleaner than I expected.

workflow_dispatch from a remote caller forces you to identify the workflow by filename or ID. Filenames aren't a contract: users rename .github/workflows/*.yml and your integration may break on the next push. IDs are stable, but resolving them needs a lookup (GET /repos/{owner}/{repo}/actions/workflows) baked into your client.

repository_dispatch drops the binding entirely:

POST /repos/{owner}/{repo}/dispatches
{ "event_type": "changes-accepted", "client_payload": {...} }

The event_type is the contract. The action listens with on: repository_dispatch: types: [changes-accepted] and reads client_payload.* as inputs. Renaming the workflow file doesn't break anything.

Removing workflow_dispatch from the action's on: block was the other win: exactly one way to fire the workflow remotely, no "two valid triggers" footgun.

workflow_dispatch still earns its place when humans need to trigger the workflow from the GitHub UI. For everything else triggered programmatically from outside the repo, repository_dispatch is the simpler contract.

reddit.com
u/alejandro_such — 14 days ago