Building with congressional data in 2026... what am I missing? Because everything is dead
I’m building an open source tool to track congressional stock trades, donors, travel, and voting records. One platform, all the data, free and open. Simple idea.
Except I can’t find data that works.
I’ve spent the last 48 hours wiring up pipelines and every single source I try is either dead, broken, paywalled, or publishing PDFs like it’s 2004. I have to be missing something because this can’t be the actual state of civic data in 2026.
Here’s what I’ve tried:
Dead:
∙ ProPublica Congress API – shut down, repo archived Feb 2025
∙ OpenSecrets API – discontinued April 2025, now “contact sales”
∙ GovTrack bulk data – shut down, told everyone to use ProPublica (which then died)
∙ Sunlight Foundation – dead for years, tools lived on through ProPublica (which then died)
∙ timothycarambat/senate-stock-watcher-data – the repo that everyone’s Senate stock trade scrapers point to. Last updated 2021. Data stops around Tuberville’s first year. The guy who was literally the poster child for congressional insider trading isn’t in the dataset.
Barely functional:
∙ Congress.gov API – returning empty responses right now. Changelog says they’re deploying tomorrow. Also went fully dark last August with no communication.
∙ Senate eFD (efdsearch.senate.gov) – 503 errors on weekends. Runs on Django behind a consent gate. When it works, it works. It just doesn’t work on weekends.
∙ House financial disclosures – ASPX form with ViewState tokens. Feels like scraping a government intranet from 2005.
∙ SEC EDGAR – “works” but there’s no crosswalk between congressional bioguide IDs and SEC CIK numbers. Common names return false positives. You’re matching by name and hoping for the best.
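To make the EDGAR name-matching problem concrete, here is a minimal sketch of the "match by name and flag the ambiguous ones" approach. It is illustrative only, not the project's actual code. The function names, the `filers` mapping, and every name and CIK in the usage note are made up. It also does nothing about nicknames or hyphenated surnames, which is a large part of why this approach stays fragile.

```python
import re
import unicodedata

SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalize_name(raw: str) -> tuple[str, str]:
    """Return (last, first), lowercased, with accents, punctuation, and suffixes removed."""
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch)).lower()
    if "," in text:
        # Filer-style "LAST, FIRST MIDDLE" becomes "first middle last".
        last, _, rest = text.partition(",")
        text = f"{rest} {last}"
    tokens = [t for t in re.split(r"[^a-z]+", text) if t and t not in SUFFIXES]
    if len(tokens) < 2:
        return (tokens[0] if tokens else "", "")
    # Middle names and initials are ignored on purpose.
    return (tokens[-1], tokens[0])


def match_filers(member: str, filers: dict[str, str]) -> dict:
    """Compare one member name against a {cik: filer_name} mapping (hypothetical shape)."""
    key = normalize_name(member)
    hits = sorted(cik for cik, name in filers.items() if normalize_name(name) == key)
    if len(hits) == 1:
        status = "candidate"  # still only a name match, not a verified link
    elif hits:
        status = "ambiguous"  # common name: needs manual review
    else:
        status = "none"
    return {"member": member, "status": status, "ciks": hits}
```

With a toy mapping like `{"0000000001": "EXAMPLE, JANE Q", "0000000002": "Example, Jane"}`, `match_filers("Jane Example", ...)` comes back as `ambiguous` with both CIKs, which is exactly the false-positive case described above.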
Not even trying:
∙ House travel disclosures – PDF only. Quarterly scanned documents. No API, no XML, no structured data of any kind. Just PDFs you parse with pdfplumber and pray the table formatting is consistent.
∙ Senate travel – published in the Congressional Record as text dumps. Good luck.
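For the "parse with pdfplumber and pray" step, a rough sketch of extracting table rows and then separating the ones that match the expected column count from the ones that don't. `extract_rows`, `clean_rows`, and `expected_cols` are hypothetical names, and the column count depends on the report layout. One caveat: pdfplumber reads a PDF's text layer, so a pure image scan with no text layer would need OCR before any of this does anything useful.

```python
def extract_rows(pdf_path: str) -> list[list]:
    """Collect every table row pdfplumber can find in a text-based PDF."""
    import pdfplumber  # third-party: pip install pdfplumber

    rows = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            for table in page.extract_tables():
                rows.extend(table)
    return rows


def clean_rows(rows: list[list], expected_cols: int) -> tuple[list, list]:
    """Split rows into (good, bad) by column count instead of trusting the layout."""
    good, bad = [], []
    for row in rows:
        # Cells can be None; collapse line breaks and repeated spaces inside a cell.
        cells = [" ".join((cell or "").split()) for cell in row]
        if not any(cells):
            continue  # drop fully blank rows
        if len(cells) == expected_cols:
            good.append(cells)
        else:
            bad.append(cells)  # keep for manual review rather than guessing
    return good, bad
```

Keeping the `bad` rows around, instead of silently dropping them, at least shows how often the table formatting breaks from one quarter to the next.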
Actually works:
∙ FEC API – functional, rate limited, but real data
∙ That’s basically it
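Since the FEC API is rate limited, a pipeline needs some kind of client-side throttle. Below is a minimal rolling-window sketch using only the standard library. `Throttle` and `fetch_json` are hypothetical names, and the limits are placeholders to set from whatever your API key actually allows.

```python
import json
import time
import urllib.parse
import urllib.request


class Throttle:
    """Allow at most `max_calls` within any rolling window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock   # injectable so the logic can be tested without waiting
        self.sleep = sleep
        self.calls = []      # timestamps of calls still inside the window

    def wait(self):
        while True:
            now = self.clock()
            self.calls = [t for t in self.calls if now - t < self.period]
            if len(self.calls) < self.max_calls:
                break
            # Sleep until the oldest call falls out of the window, then re-check.
            self.sleep(self.period - (now - self.calls[0]))
        self.calls.append(now)


def fetch_json(url, params, throttle):
    """GET a JSON endpoint, blocking first if the throttle says to."""
    throttle.wait()
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(full_url, timeout=30) as resp:
        return json.load(resp)
```

Usage would look something like `fetch_json("https://api.open.fec.gov/v1/candidates/", {"api_key": "DEMO_KEY", "per_page": 20}, Throttle(max_calls=30, period=60))`. The endpoint path and parameter names there are my assumption and should be checked against the OpenFEC docs.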
Every GitHub repo I find for congressional data scraping is archived, abandoned, or points to APIs that no longer exist. Every nonprofit that used to aggregate this data has either shut down or gone behind a paywall. The raw government sources exist but they’re spread across six different agencies using six different formats with six different auth methods and zero shared identifiers.
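The "zero shared identifiers" problem more or less forces a crosswalk table: pick one canonical ID per member and hang every other source's ID off it. A rough sketch of that data structure follows, assuming the Bioguide ID is the canonical key. The class names, the source labels, and every ID in the usage note are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    bioguide_id: str
    name: str
    # source label -> set of IDs, since one person can hold several IDs in a source
    external_ids: dict = field(default_factory=dict)


class Crosswalk:
    def __init__(self):
        self.members = {}   # bioguide_id -> Member
        self._index = {}    # (source, external_id) -> bioguide_id

    def add_member(self, bioguide_id, name):
        self.members[bioguide_id] = Member(bioguide_id, name)

    def link(self, bioguide_id, source, external_id):
        """Attach an external ID, refusing to let two members claim the same one."""
        owner = self._index.get((source, external_id))
        if owner is not None and owner != bioguide_id:
            raise ValueError(f"{source}:{external_id} is already linked to {owner}")
        member = self.members[bioguide_id]  # KeyError if the member is unknown
        member.external_ids.setdefault(source, set()).add(external_id)
        self._index[(source, external_id)] = bioguide_id

    def resolve(self, source, external_id):
        """Return the Member for a source-specific ID, or None if it is unlinked."""
        bioguide_id = self._index.get((source, external_id))
        return self.members.get(bioguide_id) if bioguide_id else None
```

For example, after `cw.add_member("X000001", "Jane Example")` and `cw.link("X000001", "sec_cik", "0000000001")`, calling `cw.resolve("sec_cik", "0000000001")` returns Jane's record. The hard part is populating the links, not storing them.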
I can’t be the only person who needs this data. What am I missing? Is there a source or project I haven’t found? Is someone maintaining scrapers that actually work in 2026?
I’m building it anyway (github.com/OpenSourcePatents/Congresswatch) but right now it feels like I’m assembling a car engine from parts scattered across different junkyards, and half the junkyards are closed on weekends.
What do you all use?