u/Annual_Upstairs_3852

I built a tool to make SAM.gov usable without APIs or SaaS

I kept running into the same problem:
SAM.gov has great data, but the bulk CSV is painful to work with.

So I built a small CLI tool that:

  • loads everything into SQLite
  • lets you search/filter instantly
  • ranks opportunities against your business

Everything runs locally.

Optional: plug in Ollama for summaries.

Would love feedback — especially on usability.

Repo: https://github.com/frys3333/Arrow-contract-intelligence-organization

reddit.com
u/Annual_Upstairs_3852 — 4 days ago

I built a real-world data tool (CSV → SQLite + ranking) — looking for feedback on my approach

I’ve been learning backend/data-focused programming and wanted to build something practical instead of just tutorials, so I picked a messy real-world dataset: the SAM.gov Contract Opportunities bulk CSV.

The problem:
The dataset is huge and not very usable directly (especially in Excel), so I tried to turn it into something queryable.

What I built:

  • ingest large CSV → store in SQLite
  • basic indexing + search (title / notice ID)
  • simple ranking system based on a “company profile”
  • CLI interface for browsing + shortlisting
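For context, the ingest step is conceptually simple. Here's a stripped-down sketch (not the repo's actual code — the column names `NoticeId`, `Title`, `NaicsCode`, `PostedDate` are my guesses at the bulk-CSV headers, so check the real file):

```python
import csv
import sqlite3

def ingest(csv_path: str, db_path: str = "opportunities.db") -> int:
    """Load a bulk-CSV export into SQLite in a single transaction."""
    con = sqlite3.connect(db_path)
    con.execute(
        """CREATE TABLE IF NOT EXISTS opportunities (
               notice_id TEXT PRIMARY KEY,
               title     TEXT,
               naics     TEXT,
               posted    TEXT
           )"""
    )
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = (
            (r["NoticeId"], r["Title"], r["NaicsCode"], r["PostedDate"])
            for r in csv.DictReader(f)
        )
        # One big transaction via executemany: per-row commits are
        # orders of magnitude slower on multi-hundred-MB files.
        con.executemany(
            "INSERT OR REPLACE INTO opportunities VALUES (?, ?, ?, ?)", rows
        )
    con.commit()
    count = con.execute("SELECT COUNT(*) FROM opportunities").fetchone()[0]
    con.close()
    return count
```

The single transaction is the part that matters for performance; everything else is standard library plumbing.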

I also experimented with adding an optional local LLM (via Ollama) for summaries, but most of the system is just standard data handling + logic.

Repo: https://github.com/frys3333/Arrow-contract-intelligence-organization

What I’m trying to learn / improve:

  • better schema design for this kind of data
  • how to handle updates to large datasets efficiently
  • whether SQLite is the right choice vs something else
  • structuring projects like this in a clean way
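On the incremental-updates point: one common answer in SQLite is its native UPSERT (available since 3.24) — key rows on the notice ID and let a conflict update in place, so re-ingesting a fresh bulk file doesn't mean wiping the table. A sketch with made-up column names:

```python
import sqlite3

def upsert_rows(con: sqlite3.Connection, rows) -> None:
    """Merge a fresh bulk export into the existing table: new notice IDs
    are inserted, existing ones are updated in place (no table wipe)."""
    con.executemany(
        """INSERT INTO opportunities (notice_id, title, naics, posted)
           VALUES (?, ?, ?, ?)
           ON CONFLICT(notice_id) DO UPDATE SET
               title  = excluded.title,
               naics  = excluded.naics,
               posted = excluded.posted""",
        rows,
    )
    con.commit()
```

Not what the repo does today, just one direction for the "updates to large datasets" question.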

If anyone has feedback on:

  • code structure
  • data pipeline design
  • or things I’m doing “wrong”

I’d really appreciate it — trying to level up from small scripts to more real-world systems.

u/Annual_Upstairs_3852 — 4 days ago

Local-first pipeline for SAM.gov bulk data (CSV → SQLite + ranking)

Flow:

  • bulk CSV ingest
  • normalization into SQLite
  • deterministic ranking layer
  • optional local LLM summarization

No cloud infra, no APIs.

The main challenge was making a large flat CSV usable for real querying.

Repo: https://github.com/frys3333/Arrow-contract-intelligence-organization

I'm relatively new to programming, so I'd love feedback on:

  • schema design
  • indexing strategy
  • incremental updates
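To make the indexing question concrete, here's the kind of setup I mean — plain B-tree indexes for exact/range filters, plus SQLite's built-in FTS5 extension for title search. Table and column names are illustrative, not the real schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # swap in the real DB file
con.execute(
    "CREATE TABLE IF NOT EXISTS opportunities "
    "(notice_id TEXT PRIMARY KEY, title TEXT, naics TEXT)"
)
con.execute(
    "INSERT INTO opportunities VALUES "
    "('A1', 'Cybersecurity support services', '541512')"
)

# Plain B-tree index: cheap, covers exact/range filters like NAICS code
con.execute("CREATE INDEX IF NOT EXISTS idx_naics ON opportunities(naics)")

# FTS5 virtual table mirroring the title column ("external content" mode,
# so the text isn't stored twice); FTS5 ships with stock Python builds
con.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS opp_fts "
    "USING fts5(title, content='opportunities')"
)
con.execute("INSERT INTO opp_fts(opp_fts) VALUES('rebuild')")

hits = con.execute(
    "SELECT o.notice_id FROM opportunities o "
    "JOIN opp_fts ON opp_fts.rowid = o.rowid "
    "WHERE opp_fts MATCH ?",
    ("cybersecurity",),
).fetchall()
```

Mostly wondering whether full-text search like this is overkill versus `LIKE` queries at this data size.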

u/Annual_Upstairs_3852 — 4 days ago
▲ 5 r/govcon

Built a fully free tool to actually sort through SAM.gov opportunities (local, no API, ranks by fit)

I’ve been working on a free tool called Arrow to make SAM.gov a bit more usable.

The main issue I kept running into was how hard it is to actually triage opportunities. You can search, but figuring out what’s worth pursuing still ends up being very manual.

So I built something that:

  • pulls the full public SAM.gov opportunities dataset (no API needed)
  • stores everything locally so you can work with it fast
  • lets you search and rank contracts against your company profile (NAICS, mission, etc.)
  • highlights which opportunities are actually a good fit vs. just keyword matches
  • optionally explains why something fits, using a local AI model (runs on your machine, not in the cloud)
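To give a sense of what "rank by fit" means here — the scoring is deterministic, something along these lines (a toy version for illustration; the weights and fields are made up, not the actual implementation):

```python
def fit_score(opportunity: dict, profile: dict) -> float:
    """Toy deterministic fit score: an exact NAICS match dominates,
    keyword overlap with the title breaks ties."""
    score = 0.0
    if opportunity.get("naics") in profile.get("naics_codes", []):
        score += 10.0  # arbitrary weight: NAICS match matters most
    title_words = set(opportunity.get("title", "").lower().split())
    keywords = {k.lower() for k in profile.get("keywords", [])}
    score += len(title_words & keywords)  # +1 per matching profile keyword
    return score

profile = {"naics_codes": ["541511"], "keywords": ["software", "cloud"]}
opps = [
    {"notice_id": "A1", "naics": "541511", "title": "Custom software development"},
    {"notice_id": "B2", "naics": "236220", "title": "Roof repair"},
]
ranked = sorted(opps, key=lambda o: fit_score(o, profile), reverse=True)
```

The point is that even a crude weighted score beats eyeballing raw keyword search results.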

The goal isn’t to replace SAM.gov — just to make it easier to:

  • filter out noise
  • prioritize real opportunities
  • quickly scan large volumes of contracts

One thing I've noticed already:
most "search" tools surface a lot of irrelevant results, but when you rank based on fit + context, the top results become much more useful.

Still early, but I’ve been using it to scan thousands of opportunities much faster than manually browsing.

Curious how others here currently handle:

  • filtering opportunities
  • deciding what to pursue vs. ignore
  • dealing with SAM.gov data at scale

check it out!

https://github.com/frys3333/Arrow-contract-intelligence-organization

u/Annual_Upstairs_3852 — 5 days ago