u/Decagon25

r/1102

Analyzed 9,989 federal infrastructure contracts worth $30.6B, found 106 anomalies

I built an automated oversight engine called Ground Truth. It pulls every federal highway and bridge construction contract (NAICS 237310) from USAspending.gov and runs peer-cohort anomaly detection on them.

The Methodology:
I used Median Absolute Deviation (MAD) z-scores. Each of the 9,989 contracts is matched to a peer cohort (same state, same sub-agency, same NAICS code, and same project phase). A contract that is an extreme statistical outlier within its own peer group gets flagged.
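
For anyone who wants to see what that looks like in code, here is a minimal sketch of cohort-based MAD scoring. It is not the Ground Truth implementation: the field names, the 3.5 cutoff (a commonly cited rule of thumb for modified z-scores), and the minimum cohort size of 5 are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median

# 0.6745 rescales the MAD so scores are comparable to ordinary z-scores
# when the data are roughly normal.
MAD_SCALE = 0.6745


def flag_outliers(contracts, threshold=3.5, min_peers=5):
    """Return {contract id: modified z-score} for contracts beyond the threshold.

    threshold and min_peers are illustrative assumptions, not Ground Truth's settings.
    """
    cohorts = defaultdict(list)
    for c in contracts:
        # Peer cohort key: same state, sub-agency, NAICS code, and project phase.
        key = (c["state"], c["sub_agency"], c["naics"], c["phase"])
        cohorts[key].append(c)

    flagged = {}
    for members in cohorts.values():
        amounts = [m["amount"] for m in members]
        if len(amounts) < min_peers:
            continue  # too few peers for a meaningful baseline
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        if mad == 0:
            continue  # at least half the cohort sits on the median; score is undefined
        for m in members:
            score = MAD_SCALE * (m["amount"] - med) / mad
            if abs(score) > threshold:
                flagged[m["id"]] = score
    return flagged
```

The median and MAD are used instead of the mean and standard deviation because a single huge contract would otherwise inflate the baseline it is being compared against.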

Platform: https://ground-truth-beta.vercel.app

The Findings (Out of 9,989 tracked awards):

  • The NYC Bridge Security Outlier: A $450M Army Corps contract for security on Manhattan/Brooklyn bridges, priced at 1,260x the median cost of its peer group.
  • The 499x Runway: A $208M taxiway repair at NAS Oceana that lands as a 499.3x outlier against Virginia Navy paving contracts.
  • The Border Wall Variance: Fisher Sand and Gravel won a $177M wall contract at 286x the median. I also found two SLSCO wall contracts awarded on the exact same day off the same parent vehicle with a 2x difference in per-mile cost ($14M/mile vs. $7M/mile).
  • National Parks: Over $250M in extreme anomalies across the NPS and Forest Service, with some projects priced at 44x the regional median.
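
The "Nx the median" figures above are plain ratios, which is a different number from a MAD z-score. A rough sketch of the arithmetic, where `median_multiple` is a hypothetical helper name and the peer amounts are made up; only the $14M and $7M per-mile figures come from the findings:

```python
from statistics import median


def median_multiple(amount, peer_amounts):
    """How many times the peer-group median a contract's amount is."""
    return amount / median(peer_amounts)


# Made-up peer amounts in $M, for illustration only.
peers = [1.0, 2.0, 3.0]
print(median_multiple(20.0, peers))  # 10.0

# The two SLSCO per-mile figures from the post, in $M per mile.
print(14 / 7)  # 2.0
```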

Why this is different:
Every finding links to the official USAspending record and ships with a frozen set of comparable peer contracts. We explicitly list Innocent Explanations (terrain, hazmat, expedited timelines) on every page so the data acts as an objective starting point for reporters.

The Tech Stack:

  • Pipeline: Python (SQLAlchemy 2.x), with bulk SQL staged through Postgres temporary tables to process 10k+ records without timeouts.
  • Storage: PostgreSQL (Neon)
  • Frontend: Next.js (TypeScript) + Tailwind + TanStack Query.
  • Validation: Currently in pilot with investigative watchdogs (including POGO and ProPublica) to refine statistical cost baselines.
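
To illustrate the temporary-table pattern from the pipeline bullet: stage all the new values in a temp table, then apply them with one set-based statement instead of thousands of single-row updates. This sketch swaps in the standard library's `sqlite3` as a stand-in for Postgres and SQLAlchemy so it runs anywhere. The `awards` and `staged_scores` tables and their columns are hypothetical, not the project's schema.

```python
import sqlite3


def bulk_update_scores(conn, scores):
    """Apply many (award_id, score) pairs with a single UPDATE via a temp table."""
    cur = conn.cursor()
    # Stage the incoming rows; the temp table is private to this connection.
    cur.execute(
        "CREATE TEMP TABLE staged_scores (award_id TEXT PRIMARY KEY, score REAL)"
    )
    cur.executemany("INSERT INTO staged_scores VALUES (?, ?)", scores)
    # One set-based statement replaces a Python loop of per-row UPDATEs.
    cur.execute(
        """
        UPDATE awards
        SET score = (
            SELECT s.score FROM staged_scores s
            WHERE s.award_id = awards.award_id
        )
        WHERE award_id IN (SELECT award_id FROM staged_scores)
        """
    )
    updated = cur.rowcount  # read before the next statement resets it
    cur.execute("DROP TABLE staged_scores")
    conn.commit()
    return updated
```

On Postgres the same idea is usually written with `CREATE TEMP TABLE ... ON COMMIT DROP` and `UPDATE ... FROM`, which keeps the round trips to the database constant no matter how many rows change.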
u/Decagon25 — 3 days ago


9,989 federal infrastructure contracts worth $30.6B with 106 anomalies

I built an engine called Ground Truth that pulls federal highway and bridge construction contracts from USAspending.gov and runs statistical anomaly detection against peer cohorts. Each contract gets matched to similar projects by state, sub-agency, NAICS code, and contract phase, then scored using MAD (Median Absolute Deviation) z-scores. Only contracts that pass all five rules of a strict publication gate get flagged.

Out of 9,989 tracked awards, 106 passed all five checks. Some patterns that emerged:

  • One contractor has 9 flagged contracts at California Navy airfields totaling $144M
  • Two border wall contracts on the same day from the same parent contract came in at $14M/mile and $7M/mile
  • The Department of the Army has the most flagged contracts: 52 out of 2,620 (a 2.0% flag rate)
  • The Department of the Navy has the highest flag rate, 2.3%, among agencies with significant volume
  • Flagged contracts by quarter show a visible spike starting around 2022
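
The per-agency rates above are just flagged divided by total, with low-volume agencies left out so a handful of awards cannot dominate the ranking. A small sketch, where `flag_rates` and the `min_total` cutoff are hypothetical and only the Army counts come from the post:

```python
def flag_rates(counts, min_total=100):
    """Return {agency: percent flagged}, skipping agencies with few contracts.

    counts maps agency -> (flagged, total). min_total is a hypothetical cutoff.
    """
    return {
        agency: 100.0 * flagged / total
        for agency, (flagged, total) in counts.items()
        if total >= min_total
    }


counts = {
    "Department of the Army": (52, 2620),  # figures from the post
    "Tiny example agency": (1, 10),        # made up: 10% rate on a handful of awards
}
for agency, rate in flag_rates(counts).items():
    print(f"{agency}: {rate:.1f}%")  # Department of the Army: 2.0%
```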

Every finding links to the official USAspending record and ships with a frozen set of comparable peer contracts so anyone can verify the math.

Tools: Python for the pipeline, PostgreSQL, Next.js for the frontend. Data source is the USAspending bulk download API.

Platform: https://ground-truth-beta.vercel.app

Source: USAspending.gov, NAICS 237310 (Highway, Street, and Bridge Construction), federal prime awards only.

u/Decagon25 — 6 days ago