u/JacobTheBuddha

A lot of noise among Sports Odds Data providers

As the founder of my own data API, I want to know:

What sports odds provider are you currently using and why?

How much are you paying and what would make you switch?

Looking to create value in the space

reddit.com
u/JacobTheBuddha — 2 days ago

Saw a competitor ship a feature I liked. Shipped it on our end this morning. (Odds Drop)

Hey r/parlayapi,

Quick update on a feature that just landed: /v1/odds-drop/{sport_key}, an SSE stream that pushes events only when a tracked price moves by >= a configured threshold. Live in the docs at parlay-api.com/docs (streaming section).

Background, since some of you have asked about this:

We've had the raw odds WebSocket and SSE streams for a while. They push every price change, even tiny ones, and your code has to maintain previous-price state to detect actual line moves. That's the right architecture for most use cases, but if you're running an arb / +EV / line-shopping scanner specifically, it means rebuilding that state-tracking layer for every (event, book, side) tuple. Worth a Saturday of work, not exactly fun.

A competitor (pinnodds.com) launched an /odds-drop feature last week with exactly these ergonomics. Good feature. I'd rather ship it than tell our paying customers to write the same plumbing themselves.

So:

GET /v1/odds-drop/basketball_nba?apiKey=YOUR_KEY&threshold=10

Params:

  • threshold: minimum American-odds delta to trigger (default 10, so -110 → -120 fires; -110 → -115 doesn't)
  • direction: both | toward_favorite | toward_dog (filter to one direction of line movement, useful for sharp-money detection)
  • bookmakers, markets, event_id: narrowing filters
  • heartbeat_s: 1-30 seconds

Event shape:

{
  "type": "odds_drop",
  "event_id": "2026-05-12_Lakers_Warriors",
  "bookmaker": "pinnacle",
  "side": "h2h_home",
  "kind": "game",
  "prev": -110,
  "new": -120,
  "delta": -10,
  "direction": "toward_favorite",
  "home_team": "Los Angeles Lakers",
  "away_team": "Golden State Warriors",
  "commence_time": "2026-05-12T22:30:00Z",
  "last_update": 1747000000123,
  "timestamp": 1747000000124
}

For player props the event also carries player, market_key, market, and line.

Tier: Business+ ($40/mo), same gate as our other streams.

Behavior to know about:

  1. First observation of each (event_id, bookmaker, side) is silent. The first time you see a side, we record the current price but don't emit an event. From the next price change onwards, you'll get drops crossing the threshold. So a freshly-opened stream takes 1-3 seconds to "prime" before drops start landing.
  2. Per-connection state. Each customer's connection has its own tracking dict, no shared state. If you reconnect frequently, you re-prime each time.
  3. Side keys for props are {market_key}:{player}:over@{line} and {market_key}:{player}:under@{line}. For game lines: h2h_home, h2h_away, spread_home@-7.5, total_over@218.5, etc.
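If you want to watch the priming behavior in item 1 for yourself, here's a minimal consumer sketch. It treats the stream as standard SSE over requests; the api.parlay-api.com host and the data:-line framing are my assumptions, so check the streaming docs for the exact details:

import json
import requests

URL = "https://api.parlay-api.com/v1/odds-drop/basketball_nba"
PARAMS = {"apiKey": "YOUR_KEY", "threshold": 10, "heartbeat_s": 15}

with requests.get(URL, params=PARAMS, stream=True, timeout=(5, None)) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        # Standard SSE framing: payload lines start with "data:".
        if not line.startswith(b"data:"):
            continue
        event = json.loads(line[len(b"data:"):].strip())
        if event.get("type") != "odds_drop":
            continue  # skip heartbeats and any non-drop frames
        print(f'{event["bookmaker"]} {event["side"]}: '
              f'{event["prev"]} -> {event["new"]} ({event["direction"]})')

Remember point 2 above: the tracking state lives per-connection, so every reconnect means a fresh prime before drops flow again.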

Verification: I stress-tested with 10 concurrent connections; all 10 streamed cleanly at ~2 drops/sec/client on active NBA + MLB markets. No errors, no leaked sessions, no memory growth.

Open question for you all: what shape do you actually want this in? Some likely directions:

  • "Only emit when the move crosses a vig threshold" (e.g. the implied probability moved by 5%+)
  • "Only emit when multiple books move the same side in the same direction within X seconds" (sharp-money confirmation)
  • "Only emit when this side's price is the new best across all books I track" (line-shopping winner)
  • Something else entirely

Drop a comment with what your scanner actually needs. Easier to ship the right feature if you tell me what good looks like.

Jacob

reddit.com
u/JacobTheBuddha — 2 days ago

Real numbers on the odds-API space (verifiable, with benchmark script)

A few people DM'd me asking about latency and access friction in the odds-API space, so I'm just going to put the numbers out publicly with a way to verify them.

I run ParlayAPI. This post will lean toward our numbers because they're the ones I can actually substantiate, but the framework below works against any vendor (TheOddsAPI, OddsJam, SportsDataIO, anyone). Run the same probes against their endpoints and you'll have an apples-to-apples answer.

Dimensions that actually matter, and how to measure them:

1. Self-serve API access vs sales-gated access. Either you can sign up and start hitting endpoints in under 60 seconds, or you can't. ParlayAPI: yes, $5/mo Starter tier with API access from minute one. Some competitors gate API behind a "contact us" sales chain at any price; that's not API access, that's enterprise sales pretending to be SaaS. Open the pricing page of whoever you're considering. If it says "contact us" instead of a credit-card-required signup, you have your answer.

2. WebSocket push tier required. WebSocket-native real-time odds tend to be locked behind expensive tiers. ParlayAPI: WebSocket available from the $20/mo Pro tier. Verify by checking competitor pricing pages; most gate WebSocket behind $200-2,000/mo tiers or a sales call.

3. Per-bookmaker pulse stamping. A common dishonesty in odds APIs is reporting last_update based on the last price-change row stored, not the last time the vendor actually re-verified the price. We surface both: last_update (price-change time) and verified_at (heartbeat time we polled and confirmed the same price), plus an is_current flag that's true if the price was verified in the last 5s. Hit any of our endpoints with ?include=verification to see it live. Then check whether your current vendor distinguishes these. Most don't.
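To turn that into an automated staleness gate, a sketch. The three field names are from the paragraph above; I'm assuming they hang off each bookmaker object in the standard odds response shape:

import requests

resp = requests.get(
    "https://api.parlay-api.com/v1/sports/basketball_nba/odds",
    params={"apiKey": "YOUR_KEY", "include": "verification"},
    timeout=10,
)
resp.raise_for_status()
for event in resp.json():
    for book in event.get("bookmakers", []):
        # last_update = last price change; verified_at = last re-confirmation;
        # is_current = re-verified within the last 5s.
        if not book.get("is_current"):
            print(f"stale: {book.get('key')} last verified {book.get('verified_at')}")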

4. End-to-end latency, book to your client. The floor here is the bookmaker's own publish rate. Pinnacle publishes game lines at roughly 2s native cadence. Nobody can be faster than what Pinnacle has already pushed. The honest question is how much overhead the vendor adds on top.

Benchmark script you can run against any WebSocket-capable odds API by swapping the URL:

import asyncio
import json
import time

import websockets

# Swap the URL (and key) to point this at any WebSocket-capable odds API.
URL = "wss://parlay-api.com/ws/odds/basketball_nba?apiKey=YOUR_KEY"

async def main():
    async with websockets.connect(URL) as ws:
        while True:
            # Stamp each frame on arrival so cadence is comparable across vendors.
            msg = json.loads(await ws.recv())
            ts = time.time()
            print(f"{ts:.2f}  type={msg.get('type')}  count={msg.get('count', '-')}")

asyncio.run(main())

You'll see frame cadence of 1.5-3s on active leagues, which matches the book's native rate. Run the same against any competitor's WebSocket (where they offer one) and compare frame timestamps over 60 seconds. The vendor whose count > 0 updates land closest to the bookmaker's own publish cycle wins.

5. Historical archive depth. Not just "we have history" but "how much, how queryable, how cheap to bulk-export." ParlayAPI: 26.8M prop closing rows + 1.39M game-line rows. Bulk historical is a single flat-rate call (/v1/historical/sports/{sport_key}/closing-odds?dateFrom=&dateTo=): one charge per query, not per date in the range. Verify by asking your current vendor what their backfill row count is and whether bulk pulls are billed per-date or per-call.
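A one-shot bulk pull over a month looks like this; endpoint from above, date format assumed ISO, so verify against the docs:

import requests

resp = requests.get(
    "https://api.parlay-api.com/v1/historical/sports/basketball_nba/closing-odds",
    params={"apiKey": "YOUR_KEY", "dateFrom": "2026-03-01", "dateTo": "2026-03-31"},
    timeout=60,
)
resp.raise_for_status()
rows = resp.json()
# One flat-rate charge for the whole range, not 31 per-date charges.
print(f"{len(rows)} closing rows for March")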

6. Failover transparency. When primary infrastructure has a hiccup, you should be able to tell. Our responses carry X-Failover-Origin: primary|hot headers and the body wrapper changes shape if you're on failover, so a parser can detect and handle gracefully. Most competitors silently degrade to stale data and never tell you. Our position is that you should know.
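Client-side, catching a failover response is a single header check; a sketch (header name from above, everything else stock requests):

import requests

resp = requests.get(
    "https://api.parlay-api.com/v1/sports/basketball_nba/odds",
    params={"apiKey": "YOUR_KEY"},
    timeout=10,
)
origin = resp.headers.get("X-Failover-Origin", "primary")
if origin != "primary":
    # Failover bodies are shaped differently, so don't parse them blindly.
    print(f"warning: served from '{origin}' tier; payload may be thinner")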

What we don't lead on:

Raw polling cadence on individual books. Everyone polls at the book's native rate, that's the physical floor. If a vendor claims sub-second end-to-end latency on Pinnacle, ask them to define the measurement boundary because Pinnacle itself publishes at ~2s. We won't out-claim our way past physics.

Number of books listed. We track 26+, some competitors list 50+. Worth noting many of the "extra" books on competitor lists are non-US sportsbooks (though I hear there's edge/money to be made on Canadian sportsbooks so...tbd) or aggregator pass-throughs with stale data; check freshness, not catalog size.

Pages with full details:

  • parlay-api.com/speed — full breakdown of cadence, latency, pulse signal
  • parlay-api.com/switch — if you're already on another paid vendor and want to test us, send proof of cancellation, get 60 days free on any tier up to Business

If your current vendor doesn't publish numbers you can verify, that's its own answer. Happy to spec specific use cases in the comments: arb scanning, +EV modeling, in-play decision engines, prop tracking.

Lmk your provider's numbers so I can beat them

Jacob

reddit.com
u/JacobTheBuddha — 2 days ago

Post-mortem: failover and caching outage, what happened, what changed

Want to walk through what happened on the platform over the past 2-3 days, what caused it, and what we changed to make sure this specific failure mode can't recur.

TL;DR: We added a hot failover tier to make the platform more resilient. The way it was wired up created the exact kind of outage we were trying to prevent. Customers saw the site flap between healthy and broken depending on which edge node served them, which made it nearly impossible to reproduce from inside the org. Fixed at multiple layers, durably, including monitoring that would have caught this within minutes if it ever recurs.

What you may have seen:

  • A plain JSON body like {"service":"parlay-failover-hot","status":"ready"} instead of the marketing page or a real API response
  • API parsers silently falling back to a secondary vendor because our responses didn't match expected shape
  • Marketing-page videos not playing on certain devices
  • "Site is down" from one device while the same URL loaded fine from another

If you hit any of those, you hit this bug.

What was actually happening:

We run a primary origin that handles all customer traffic, plus a hot failover tier that's supposed to step in when the primary is unreachable. The failover serves a thinner response shape so your parser doesn't crash entirely while we recover. That's the intent.

What was actually wired: the failover tier got registered onto the same routing layer as the primary. The routing layer treats multiple registered backends as redundant copies and load-balances traffic across them. So roughly half of your requests hit the real backend with full data. The other half hit the failover stub. The split varied by which network edge served your request, so different people on different ISPs / cities / cellular vs wifi connections were getting different ratios of "broken" to "working."

This is the textbook category of bug that's hardest to catch from inside the org: it works perfectly from where the engineers test, and breaks intermittently from elsewhere. The fact that we caught it at all is mostly thanks to customers running their own telemetry and surfacing the discrepancy.

The compound problem:

Even after we identified and fixed the routing, customers who had received a bad response were stuck on it locally for up to 4 hours because the bad response had been cached at multiple layers (browser HTTP cache, CDN edge cache, intermediate proxy cache). Each layer required a different fix.

What we changed:

  1. The failover tier is no longer joined to the same routing layer as the primary. It lives at its own dedicated endpoint, only reachable when the routing layer explicitly fails over to it.
  2. End-to-end probes now run from a separate vantage point (not from our own infrastructure) and check the actual response body, not just HTTP status. A response that returns 200 but contains the failover stub instead of real data is treated as critical and pages immediately.
  3. Customer-facing HTML now serves Cache-Control: no-store, must-revalidate, so a poisoned response cannot pin a browser cache for hours. Even if the absolute worst-case happens again, customer recovery is measured in seconds, not hours.
  4. The internal layer that proxies your traffic to origin now bypasses intermediate caching, so a stale response cannot be served from a layer between us and you.
  5. New response wrapper opt-in (?format=wrapped) so customers who want their parser to normalize once across both primary and failover responses can pin to a stable format. Backward-compatible default unchanged for everyone else.
  6. New /speed page (parlay-api.com/speed) publishes the actual numbers and the methodology so anyone can verify infrastructure claims independently.

Shoutouts:

u/bigantny built telemetry on his side that separated "raw event count > 0" from "normalized event count = 0" and caught the failover-shape bifurcation in his own parser before we had any internal signal. That observability shape is exactly what told us to look at routing layers rather than CDN caches, which saved real hours. He's basically an unofficial mod of this sub at this point. The kind of user who makes the product better for everyone else by paying attention.

u/AdMaleficent5772 flagged the outage from his end while we were still chasing symptoms downstream, and stayed in the back-and-forth on features and bugs all week. Apologies for the chaos and genuine thanks for the persistence.

If anyone else saw weird responses over the past few days, please respond here or DM. The internal monitoring catches it now, but customer reports remain the fastest signal.

What we promise going forward:

  • Status page will reflect actual customer-visible state, not just whether our processes are alive.
  • Failover responses will always be distinguishable from primary via X-Failover-Origin header and (optionally) body wrapper shape via ?format=wrapped. Documented in /docs/response-shapes.
  • For high-stakes use cases (arb scanning, in-play models), the WebSocket pipeline with ?include=verification exposes per-event verification timestamps so you can defensively gate your own logic.
  • Credits applied to next-month billing for paying customers whose work was disrupted during this window. If you think you got hit and want the credit, ping directly with a rough description of the disruption window.

Apologies for the chaos. Trying to make the platform more reliable temporarily made it less reliable. The architecture is in better shape now than it was before this happened, and the monitoring is genuinely better. If we have to publish another one of these any time in the near future, I'll be disappointed in myself.

Your Tech Wizard / Infinite Super Genius Sports Betting Data guy,

-Jacob of ParlayAPI

reddit.com
u/JacobTheBuddha — 2 days ago

OddsJam's senior support response to my cancellation request: "It is standard to offer retention options." Yes, that's the problem.

OddsJam support reply to my refund request, verbatim:

>

Posting because anyone considering OddsJam should read that reply twice before signing up, and because the chronology that produced it is worth being public about.

What happened, in order:

  1. Signed up for the 7-day free trial, planning to use the API for the evaluation I had in mind.
  2. Discovered API access is not self-serve on any tier; it requires going through a "contact us" sales chain. That's a different product shape than what the trial signup implied. The manual-outreach gate was a non-starter for my use case, so I decided not to move forward.
  3. Tried to cancel inside the signup flow itself, before the trial converted. The cancellation path routed through more than five consecutive screens, each one burying the "cancel" action behind retention offers, discount prompts, and "are you sure" confirmations. The signup had been three fields and one button. In my experience the asymmetry was enough that I didn't complete the cancellation, which appears to be precisely the outcome the flow design favors.
  4. Card got charged at trial conversion. (Their own server logs will show zero sessions past day one of signup, which I expect will matter when this gets reviewed by my card issuer.)
  5. I emailed support requesting a refund. In my initial email I cited the FTC's 2024 Click-to-Cancel rule. I'll be upfront: I later learned that the 2024 amendments were vacated and the FTC recodified the pre-2024 rule text in February of this year, so the specific citation I used was off. The federal statute it built on (the Restore Online Shoppers' Confidence Act, 15 U.S.C. § 8403) is still very much in force, and has required online merchants since 2010 to provide a "simple mechanism" for stopping recurring charges. In my view, a retention-loaded multi-screen flow does not meet that bar regardless of which side of the FTC rulemaking timeline you're standing on.

What I got back:

Three replies across what looked like three separate threads. The first was signed "Randall." The second was also signed "Randall." The third, the one quoted at the top of this post, was signed "James." The tone shifted enough between them that I genuinely cannot tell whether OddsJam support is staffed by multiple humans rotating coverage or whether the signatures are dressing on templated replies. Either reading is unflattering.

The reply from James is the one worth dwelling on. Every clause is doing work:

>

That is a person manufacturing a paper trail against the customer instead of addressing the substance of the complaint. Announcing it out loud is its own moment.

>

Federal consumer protection statute is binding regardless of what a merchant's own terms of service say. Citing your own contract as the answer to a federal-law question is not a defense; it is a tell.

>

"Standard industry practice" is exactly the defense companies have always used for dark patterns. That other merchants engineer cancellation friction does not, in my view, make it acceptable here. That sentence is the indictment, not the defense.

>

Other than the five-plus confirmation gates and retention upsells engineered into the path between me and the cancel button. ROSCA exists specifically because the difference between "technically possible to cancel" and "actually simple to cancel" is the entire problem.

Setting the legal question aside for a minute:

Even granting OddsJam every benefit of the doubt on whether the cancellation flow is technically lawful, my opinion is that it's just bad business.

A concrete contrast from the same week this happened. I had a yearly Midjourney subscription that auto-renewed on April 20. I didn't realize until May 11, three weeks past the charge, and emailed asking for a refund. Midjourney pushed the refund immediately, reminded me my subscription might still be active in case I wanted to keep it, and included a one-click link to manage it. Total time to resolution: under a day. Total friction: zero. They didn't quote their terms of service at me, didn't route me through a retention gauntlet, didn't manufacture a paper trail to defend themselves against a future dispute.

That is what a subscription business with confidence in its product looks like. A company that believes customers will come back next year does not need to weaponize a five-screen retention gauntlet to keep someone who has already decided to leave.

OddsJam's reply to me was the opposite of that, and it tells you what they think the cost of letting a customer leave gracefully is versus the cost of squeezing one more billing cycle out of them.

Where this is going:

I'm pursuing a chargeback through my card issuer once the original charge settles. Chargeback reason codes around "merchant did not honor cancellation request" and "services not as described" do not require a regulatory citation to succeed; they require documentation that the customer tried, the merchant resisted, and a paper trail exists. The reply above is the paper trail.

Why I'm posting:

  • If you're considering OddsJam: expect cancellation to take more effort than signup did, and expect their support to defend that friction in writing when challenged.
  • If you wanted API access specifically: be aware it's not self-serve on any tier. Get past the "contact us" wall before paying for anything, not after.
  • If you're already in a similar situation: the chargeback path is open to you. ROSCA is the federal hook. Document the cancellation flow while you still have account access.
  • I would genuinely rather have used the product and been satisfied. What turned a normal refund request into a public post is not the charge. It is a senior rep saying in writing that engineered friction is "standard" and therefore acceptable. In my view that is bad customer policy, and it is also, separately, bad business.
reddit.com
u/JacobTheBuddha — 2 days ago

Your AI assistant can now query live sports odds without writing any code

If you're building anything in this space with Claude Desktop, Cursor, ChatGPT custom GPTs, or any other MCP-compatible AI client, here's what's now possible.

ParlayAPI ships an MCP server (parlayapi-mcp) that exposes 10 native tools to any MCP host:

  • list_sports — every supported sport + league key
  • get_odds — live moneyline / spread / total across all books
  • get_player_props — player props, filterable by player + market
  • find_arbitrage — pre-computed cross-book arbitrage opportunities
  • find_positive_ev — pre-computed +EV bets vs no-vig consensus
  • compare_books — side-by-side line comparison across every book
  • get_prediction_market_prices — Kalshi + Polymarket prices
  • get_historical_odds — backtesting against the closing-line archive
  • get_archive_coverage — public archive stats (no key needed)
  • get_account_usage — authenticated credit usage check

What that solves:

You don't have to write any code to give your AI assistant access to live sports odds. Connect once, your assistant calls the tools directly when you ask.

Practical example. With the MCP server connected to Claude Desktop, the prompt:

>

Becomes a single function call to find_positive_ev. Claude parses the response, formats the table, done. No Python, no curl, no schema guessing.

Same idea in Cursor while building a model:

>

The IDE calls get_historical_odds and inlines the data in your editor. You spend zero time on the data layer, all your time on the model.

Connect it:

The manifest is at parlay-api.com/mcp/manifest.json. Install instructions and the per-client MCP config (Claude Desktop, Cursor, etc.) are at parlay-api.com/mcp. Free tier is 100K credits / month, no card required, so the agent can sign itself up and start working in one session.

What other betting-workflow tools would you want exposed as native MCP tools? Adding what people actually use is easier than guessing.

reddit.com
u/JacobTheBuddha — 5 days ago

If you're using AI to build a sports betting tool, the data layer is the easy part

Half the people building anything in this space now are doing it through Claude / Cursor / GPT. Saw three "I built this in a weekend" posts last week and all three started with "I asked Claude how to build a +EV scanner and..."

The data layer is the easy part to get right if you pick an API the model actually understands. Most odds APIs were designed for humans reading docs, which means LLMs guess the schema, generate broken curl, and you spend an hour fixing imports.

What works better when you're getting Claude / Cursor to write a betting tool:

1. Pick an API that ships /llms.txt and /llms-full.txt.

ParlayAPI does. The model reads the long-form reference, knows the endpoints, generates working code on the first try. Compare to APIs where the model has to infer the schema from a marketing page.

2. Look for a /cookbook page with drop-in prompts.

ParlayAPI has /cookbook with copy-paste prompts written specifically for Claude / GPT / Cursor. CLV tracker, +EV scanner, arb detector, prediction-market radar, line-movement watcher. Saves the back-and-forth where you describe the problem in natural language and the model writes 200 lines you have to debug.

3. agents.json + MCP when your tool needs to expose itself to other agents.

ParlayAPI ships both. Claude Desktop or Cursor users can connect over MCP and start querying odds without writing any code. The model just gets a tool called get_odds and uses it like any other tool.

4. Free tier without a credit card.

Claude / Cursor will sign up for free tiers as part of the workflow. Anything that requires a card breaks the flow because the model can't enter payment info. ParlayAPI's free tier is 100K credits / month with no card.

Practical example. The prompt:

>

Working code on the first try, because the prompt could land on /cookbook, read the response shapes from /llms-full.txt, and follow the documented pattern.

The data layer is not where AI-coded betting tools fail. They fail at:

  • Bankroll math (Kelly sizing, parlay correlation, devig)
  • Scheduling and deduplication of bet placement
  • CLV tracking after the fact

Those are model-side problems. Solve those and the data is a free input.

What other APIs in this space are LLMs picking up cleanly? Curious which other tools have built this part well.

reddit.com
u/JacobTheBuddha — 5 days ago

What ParlayAPI actually does, in plain English

If you landed in this sub and aren't sure what we are: short answer, ParlayAPI gives you every major sportsbook's prices in one call.

That's the whole pitch.

What that solves:

You want to bet the Lakers tonight. To find the best price you'd normally check DraftKings, FanDuel, BetMGM, Caesars, BetRivers, and Pinnacle one at a time. With ParlayAPI you check them all at once and take whichever pays best.

Same idea for player props. PrizePicks has LeBron at 26.5 points. Underdog has 26.5 too. Pinnacle has 27. FanDuel has 27.5. You see all of those side by side in one query and pick whatever your model likes.

Who actually uses it:

  • Bettors who shop every line before placing
  • People building tools that flag mispriced bets
  • Folks running fantasy / DFS contests who need fresh prop lines
  • Backtesters comparing models against actual closing lines
  • Discord bot operators pushing live odds + arbitrage finds to their channel
  • A few sportsbook employees doing competitive intelligence (yes, really)

What it costs:

Free tier is 100,000 calls a month, which covers most hobby projects. If you outgrow that, paid tiers are $5, $20, $40, $100, or $200 a month depending on how much data you pull and how far back the historical archive needs to go.

What it isn't:

Not a betting account. Doesn't place bets. Doesn't tell you what to bet. It just gives you the prices the books are already showing publicly, in one place, with one key.

What's in the bag besides US sportsbooks:

  • French-licensed books (Betclic, PMU, Unibet, Winamax) for European market work
  • DFS apps (PrizePicks, Underdog, Sleeper, Pick6, Betr, Fliff)
  • Prediction markets (Kalshi and Polymarket)
  • Live in-play period markets (Q1, Q2, Q3, Q4, halves) with replay history so you can see how a Q3 line moved during last night's game

If you want to try it: hit /signup on parlay-api.com, you'll get a key in 30 seconds. The cookbook page has copy-paste examples to get your first useful query running in two minutes.

What was the first useful thing you built or queried when you started using it? Curious what other people in the sub did first.

reddit.com
u/JacobTheBuddha — 5 days ago

Every winning sports bettor I know has at least 5 sportsbook accounts

Not for promos, not for spreading action. For one boring reason: they shop the line.

Same game, same bet, different prices. Books don't coordinate. Here's a real spread from last night's NBA games:

Lakers -3.5

  • DraftKings: -110
  • FanDuel: -108
  • BetMGM: -112
  • Caesars: -110
  • Pinnacle: -107

Same bet. Five different prices. If you put $110 down on the Lakers at BetMGM, you'd win $98.21. The exact same bet at Pinnacle wins $102.80.

Per bet that's pocket change. Over 1,000 bets a season at $110 stakes, that's $2,000 to $3,000 you've left on the table just by not checking the other apps. For free.

Casual bettors don't shop. They open one app, place the bet, move on. They're paying the worst available price every time and wondering why their bankroll grinds down even when they hit at a normal rate.

How to actually do it without losing your mind:

  1. Open accounts at 4-5 books. DraftKings, FanDuel, BetMGM, Caesars, BetRivers cover most US states. They're free to open, no commitment.
  2. Before placing any bet, check the same market across all of them. Odds comparison sites do this in 5 seconds. Some free APIs return every book's price in a single call (this sub's whole reason for existing).
  3. Take the best price. That's the whole strategy. There is no clever step 4.

The math is boring and that's why it works. Most bettors won't do it because each shopping session feels like winning $2 instead of winning $100. The compound effect is what matters. Shopping every bet for a season is often the difference between "down a little" and "actually broke even".

The obvious counterargument: "What if I get limited at the book that always has the best price?" Soft books (DK, FanDuel especially) do limit winning bettors. The fix is the same as the original advice: spread your action across multiple books, never bet huge on one. If you're flat-staking $50-100 per bet, you fly under the radar at all of them for years.

Anyone here still using just one book? Genuinely curious what's keeping you from spreading out.

reddit.com
u/JacobTheBuddha — 5 days ago

Where to get Data for a sports betting model

The data stack for a working sports betting model is cheaper and simpler than the affiliate-spam guides make it look. Here's the actual breakdown, organized by what you're building.

TL;DR

ParlayAPI free tier covers about 80% of retail use cases for $0. 100K credits per month, 26+ books, live + historical + props + prediction markets in one key. The remaining 20% is sport-specific edge cases (deep box scores, real-time injury news), which you supplement with free open-source tools: nflverse, hoopR, baseballr, pybaseball. Total cost to ship a working model: $0 to $20 per month.

Live multi-book odds (for a +EV scanner)

You need multiple books, fresh data, and a sharp anchor. Pinnacle is the universal sharp; everything else is the soft-side liquidity that lags it. ParlayAPI gives you Pinnacle plus 25+ retail books in one endpoint:

/v1/sports/basketball_nba/odds?regions=us&markets=h2h&bookmakers=pinnacle,draftkings,fanduel

Latency is 1-4s on Pinnacle, 5-10s on the rest. The free tier covers 100K calls per month, enough for 60-second NBA + MLB + NHL coverage all season.
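The whole polling layer is a loop like this (endpoint and params from above; the budget comment is simple arithmetic):

import time
import requests

URL = "https://api.parlay-api.com/v1/sports/basketball_nba/odds"
PARAMS = {
    "apiKey": "YOUR_KEY",
    "regions": "us",
    "markets": "h2h",
    "bookmakers": "pinnacle,draftkings,fanduel",
}

while True:
    # 60s cadence on one sport is ~43,200 calls/month,
    # comfortably inside a 100K/month budget.
    resp = requests.get(URL, params=PARAMS, timeout=10)
    resp.raise_for_status()
    print(f"{time.strftime('%H:%M:%S')}  {len(resp.json())} events")
    time.sleep(60)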

Historical closing lines (for backtesting)

The data shape is just (game_date, sport, home, away, source, close_price). The more books per game, the better.

/v1/historical/sports/{key}/closing-odds returns 7+ books per closing line for NBA / MLB / NFL / NHL games from 2024 forward. For older NBA / NFL data: hoopR and nflverse (R + Python packages, free, well maintained). For soccer back to 2005: football-data.co.uk has CSVs for 22+ leagues, no key needed.

Player props (the hardest free layer)

Pre-game prop lines exist on the live /v1/sports/{key}/props endpoint across 13+ books (DraftKings, FanDuel, Pinnacle, plus DFS apps PrizePicks / Underdog / Sleeper / Pick6 / Betr / Fliff). Historical prop closing lines (15M+ rows) at /v1/historical/sports/{key}/closing-odds?markets=player_*. Archive starts April 2026 since prop archival is newer than game-line archival.

For player stats to actually feed the model: pybaseball for MLB, hoopR for NBA, nflverse for NFL. All free, all maintained.

Live in-play data

/v1/sports/{key}/live returns events that have already started. /v1/sports/{key}/live/period_markets returns in-game Q1-Q4 / 1H spreads + totals + h2h from Pinnacle / DK / FD / MGM / Caesars. The newer /v1/historical/sports/{key}/period_markets endpoint stores every distinct in-play line state with first_seen_ms / last_seen_ms, so you can replay how a Q3 line moved during last night's game.

Real-time injury / lineup news

The genuinely hard layer. ParlayAPI surfaces lineups and ESPN-derived injury status. For sub-1-minute beat-reporter feeds, RotoWire or Action Network's injury subscription products are the standard. Most retail models don't need this layer if they train on closing lines, since the close already incorporates injury news.

Where to start

Sign up for the ParlayAPI free tier. 100K credits per month is enough to validate any model idea before you pay anything. Once you outgrow free, Starter at $5/mo unlocks 7-day historical depth, Pro at $20 unlocks 30-day, Business at $40 unlocks 90-day, and Scale at $200 unlocks the full 10-year archive.

A working +EV scanner is a weekend project against this stack. The data is not the bottleneck anymore.

FAQ

Where do I get free sports betting odds data?

ParlayAPI free tier (100K credits / month, 26+ books, no credit card required). The Odds API free tier (500 requests / month, polling only). For historical: sportsbookreviewsonline.com, football-data.co.uk, nflverse, hoopR, pybaseball.

What data do I need to build a +EV sports betting scanner?

Multi-book live odds (Pinnacle plus retail books) and a no-vig fair value calculation. That's it. Compare offered prices to Pinnacle's no-vig, flag anything that pays better. Doable in under 100 lines of Python against the ParlayAPI free tier.

Can I use Excel data for a sports betting model?

For backtesting, yes. Yearly Excel files exist on sportsbookreviewsonline for MLB through 2021. Modern models almost always use a JSON API for the live layer, even if historical comes from CSVs.

What's the difference between game lines and player props for modeling?

Game lines are the moneyline, spread, and total for the team-vs-team result. Player props are individual-player markets like "LeBron over 26.5 points". Different volume profiles, different books, often different APIs. Most retail bettors lean game-lines for cleaner +EV; prop edges are real but harder to size.

How accurate is the data from a sports betting API?

A real aggregator returns the book prices at the moment of poll. ParlayAPI lets you verify any book is flowing right now via /v1/bookmakers/{key}/freshness (free, no auth, returns age in seconds since the last write per backing table). If the latency you're seeing is more than 30s on any API, that's not modeling-grade data.

Drop your stack in the comments

Always curious how other people in this sub set up their data layer.

reddit.com
u/JacobTheBuddha — 5 days ago

The complete sports betting data stack for 2026: every free and paid source, ranked by what real models need

Most "how to build a sports betting model" guides skip the boring part: where the data comes from. Then six months later you find out your CSV pull from ESPN drops every postponed game and your "model" is overfitting on selection bias.

This is the actual stack. Every source I have used, what each one is good for, and the gotchas I wish someone had told me before I paid for the wrong tool. Bookmark and share with the next person asking "where do you get NBA data".

TL;DR

A serious sports betting model needs four data layers: live odds across multiple books, historical closing lines, player + team stats, and injury / lineup news. The free options cover three of those well enough to ship a model. The fourth (real-time multi-book odds) is where every paid API fights for your money. Pick the cheapest one that has the books and the latency you need, integrate, and stop overthinking it.

Cost to build a real-money +EV scanner from zero: $0 to $20 per month for the data, plus your time. Anyone telling you it costs more is selling you something.

Why most public guides are useless

The guides that show up on Google when you search "sports betting data" fall into three buckets:

  1. Affiliate spam posing as comparison articles. Always recommend the same three paid APIs because that is who pays the highest affiliate rate.
  2. Old Kaggle tutorials using pre-built CSVs from 2018. Fine for learning regression. Useless for live betting.
  3. Out-of-date "best of" lists from 2022 that still recommend providers that have shut down, pivoted, or jacked up prices 5x since.

The real answer depends on what you are building. Closing-line backtester? Historical archives only. Real-time +EV scanner? You need live multi-book odds. Player prop model? You need box scores plus prop-specific archives almost no one publishes. Each layer has different sources.

The four data layers every model needs

1. Live odds (multi-book)

The single most expensive and most differentiated layer. You need at least one sharp book (Pinnacle or Circa) plus 4-6 retail books (DraftKings, FanDuel, BetMGM, Caesars, BetRivers, Fanatics). Sharp book gives you the no-vig fair value. Retail books are where the actual +EV bets live (when their slow updates lag the sharp).

Latency matters. A live odds feed that is 30 seconds behind the book is fine for slow markets, useless for in-play.

2. Historical closing lines

Closing line is the wisdom-of-crowds price at game time. Backtesting a model against historical closing lines is the gold standard for measuring whether your edge is real. Two reasons:

  1. The close incorporates injury news, weather, sharp action, and late line shopping. It is the most accurate single number a market produces.
  2. CLV (closing line value) is the metric that matters for evaluating ongoing edge. You need historical closes to compute it.

Free archives exist for some sports going back decades. Paid archives extend deeper or include more books per game. Pick based on how far back you actually need.

3. Player and team stats

Box scores, advanced stats (eFG%, OPS, EPA, expected goals, etc.), play-by-play. Free for every major sport via official league sites and open-source projects (nflverse, hoopR, cfbfastR, baseballr). Quality is solid; the main task is normalization across years and rule changes.

4. Injury / lineup news

The hardest layer to source cleanly. Real-time injury news moves lines before the books update. Most public APIs surface injury data 1-15 minutes behind Twitter. Paid services exist that monitor team accounts and beat reporters in real time; they are expensive and most are run by one person.

Most retail bettors do not need this layer. If your model is using closing-line training data and projecting to opening-line bets, the closing line already has injury news baked in.

Free data sources (and their actual limits)

The Odds API has a free tier at 500 requests per month. Enough to play with the data shape, not enough to run any real polling. Their free tier was the bar everyone tried to undercut for years.

Sportsbookreviewsonline is the OG historical archive. Free yearly Excel files for MLB through 2021, HTML tables for NBA / NFL / NHL. Patchy after 2022. Most public datasets you find on Kaggle are derivatives of SBR.

football-data.co.uk has soccer closing lines for 22+ leagues going back to 2005. CSVs published Mondays. Free, no key, idempotent imports work great.

nflverse (R + Python packages) has every NFL play-by-play back to 1999, plus pre-game odds for most years. Active maintenance. Free.

hoopR does the same for NBA from 2002 forward. cfbfastR for NCAA football. baseballr for MLB. All free, all maintained, all queryable in Python via pybaseball and equivalents.

ESPN has a public scoreboard API for every major sport. Useful for box scores and final results, not useful for odds. (Their pickcenter only goes back ~2 years and is patchy.)

Kaggle datasets are great for learning. Generally too stale for production models. The dataset's last-updated date matters more than its size.

Paid data APIs ranked

Quick reality check: every paid API has a free tier. Sign up for all of them, hit each /odds endpoint with your sport, measure latency yourself, then decide. Anyone who pays before testing is wasting money.

What paid APIs actually compete on:

  • Latency: how fresh is the data when you pull it? Anything over 30s is useless for live in-play.
  • Coverage breadth: how many books per game? More is generally better for cross-book +EV scanning.
  • Player props: most APIs have NBA props, fewer have MLB pitcher props, almost none have CBB props pre-tip. If your model needs props, this is what to test.
  • Historical depth: how far back, how complete, how many books per closing line.
  • Pricing model: per-call, per-month, per-credit. Read the fine print.

ParlayAPI (yes, this sub) covers all four layers with one key. 26+ active books across game lines / props / DFS / prediction markets, with French-licensed books (Betclic, PMU, Unibet, Winamax) for European market work that most US-focused APIs miss. The free tier is 100,000 credits per month, enough to poll NBA every 30 seconds for the entire season. Historical archive: 1.39M+ rows back to 1999 for NFL, 2017+ for NBA, plus 15M+ player prop closing lines from April 2026 forward. Tier table goes free / $5 / $20 / $40 / $100 / $200, with the free tier covering most hobby projects.

Other paid APIs in 2026: The Odds API (the incumbent, ~$30-60/mo for usable polling), OpticOdds (sharp book focus, more expensive), and a handful of newer ones. Test the latency and coverage on free tiers before paying.

Common mistakes when sourcing data for a model

  1. Using closing lines for training, opening lines for prediction. If your model is trained on closes (because that is what is archived) but you tell yourself you would have placed bets at the OPEN, you have leaked closing information into training. Open-line model performance will collapse vs your backtest.
  2. Forgetting that pre-game odds include injury news. If LeBron's "questionable to out" hits at 11am ET and your model uses 6pm pre-game odds and 11am injury status as separate features, your "model" is reading the line move twice.
  3. Computing EV against the vig'd line instead of the no-vig fair value. Almost every "+EV calculator" online does this. Run it against Pinnacle no-vig instead and most "+EV bets" disappear.
  4. Ignoring the bid-ask spread on betting exchanges. Novig, ProphetX, and Polymarket are markets, not books. The price you actually fill at is not the displayed mid.
  5. Trusting your win rate over fewer than 1000 bets. A bettor with a 1% true edge over 1000 flat $100 bets has a standard deviation of ~$3,160 around an expected $1,000 profit. Your "I'm down this month" or "I'm up this month" is mostly noise. Track CLV in implied probability, not win rate.
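To sanity-check the figure in item 5: a flat $100 bet at near-even odds is roughly a ±$100 coin flip, so the per-bet standard deviation is about the stake, and over n bets it grows as sqrt(n):

import math

n, stake, edge = 1000, 100.0, 0.01
expected = n * stake * edge    # 1% edge on $100K total staked = $1,000
sd = stake * math.sqrt(n)      # ~$3,162, matching the ~$3,160 above
print(f"expected ${expected:,.0f}, standard deviation ${sd:,.0f}")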

How I would build a data stack from scratch in 2026

If I were starting a real-money +EV scanner today, with zero infrastructure:

  1. Sign up for the ParlayAPI free tier. 100,000 credits per month is enough to poll NBA + MLB + NHL game lines on a 60-second cadence for the entire season. Get the multi-book live data first; everything else can come from free sources.
  2. Pull historical closing lines once and cache them. /v1/historical/sports/{key}/closing-odds is on the free tier (with 48-hour depth) and on Starter at $5 (7-day depth). For longer backtests, Business at $40 gets you 90 days, Enterprise at $100 gets a year. Most hobbyists never need more than a season.
  3. Layer in box scores from nflverse / hoopR / pybaseball. Free, well-maintained, every major sport. Cache locally; these change rarely.
  4. Skip the dedicated injury API. Use closing lines as your training target so injury info is already baked in. If you need real-time injury alerts later, add a Twitter list or pay a service.
  5. Build the model using closing lines as your training target. Backtest against opens. Compute CLV per bet, not win rate. Track CLV monthly.
  6. When you outgrow free tier: upgrade to Starter or Pro. Most retail +EV operations top out around $20/mo for data costs total.

FAQ

What is the cheapest sports betting data API?

ParlayAPI's free tier (100,000 credits / month) covers most retail use cases at $0. Beyond that, ParlayAPI Starter at $5/mo or The Odds API's lowest paid tier at ~$30/mo are the cheapest options with usable latency. Avoid anything that does not let you test the free tier first.

Is there a free sports betting odds API?

Yes. ParlayAPI free tier (100K credits/mo, 26+ books). The Odds API free tier (500 requests/mo, polling-only). For historical only, sportsbookreviewsonline.com (Excel / HTML files), football-data.co.uk (soccer CSVs), and nflverse / hoopR packages on R and Python.

How do I get historical NBA betting odds?

For 2017 forward, hoopR (R / Python). For 2024 forward with multiple US books per game, ParlayAPI's /v1/historical/sports/basketball_nba/closing-odds endpoint returns 7+ books per closing line. SBR has older NBA seasons in HTML tables but coverage drops after 2022.

Where do I get player prop data for sports betting models?

ParlayAPI's /v1/sports/{sport}/props endpoint returns props from 13+ books and DFS apps including PrizePicks, Underdog, Sleeper, Pick6, Betr, Fliff, Pinnacle, DraftKings, FanDuel. Closing lines for player props specifically: /v1/historical/sports/{sport}/closing-odds?markets=player_*. Coverage starts April 2026 forward (when prop closing-line archival began).

What is CLV in sports betting?

Closing Line Value. The implied-probability difference between the price you got and the closing line of the same market. Positive CLV is the strongest single predictor of long-term betting profit, more reliable than win rate over small samples. Track in implied probability points, not in cents, so it is comparable across odds formats.
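In code, that definition is two implied-probability conversions and a subtraction. A small helper for American odds:

def implied_prob(american: int) -> float:
    # American odds -> implied probability (vig still included).
    if american > 0:
        return 100 / (american + 100)
    return -american / (-american + 100)

def clv_points(bet_odds: int, closing_odds: int) -> float:
    # Positive = you beat the close, in implied-probability points.
    return (implied_prob(closing_odds) - implied_prob(bet_odds)) * 100

# Took -105, it closed -120: you beat the close by ~3.3 points.
print(f"{clv_points(-105, -120):.1f}")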

What latency do I need for live sports betting?

Depends on the model. Pre-game scanners are fine with 60+ second cadence. In-play models need 5-15 seconds. Steam-chasers and arb scanners need sub-5 seconds. Anything sub-1 second requires direct book feeds, not aggregator APIs.

Can I use a sports betting API for free?

Yes. ParlayAPI's free tier (100,000 credits / month, no credit card required) covers most retail model use cases. Polling NBA + MLB + NHL on a 60-second cadence stays well within budget. Historical archive included with 48-hour query depth on free.

Drop your stack in the comments

Curious what the actual readers here are running. Free tier only? Mix of paid + free? Got a clever combo I should be using? The good ideas in this thread will end up in v2 of this guide.

reddit.com
u/JacobTheBuddha — 5 days ago

What Shipped This Week (May 2-9): sub-2s in-play, sandbox tier, new docs, fraud caps, more

Big infrastructure week. Catching up the subreddit on what's new.


**Customer-facing:**

- **`/v1/sandbox/*` endpoints** — synthetic data, no auth, IP rate-limited. Test our response shape and timing without paying or even signing up. Useful during off-hours when no live games are running. [docs](https://parlay-api.com/docs#sandbox)
- **`/v1/sports/{sport}/live/source-health`** — per-source freshness diagnostic. Poll it every 30s from your bot to detect when a feed goes stale, so you don't trade on dead data.
- **WNBA play-by-play** — ESPN-sourced, 5-10s end-to-end, same `/v1/sports/basketball_wnba/live/sse` shape as NBA.
- **SSE PBP now includes player names + scores** — earlier the trigger only sent `event_type`. Fixed; `team_or_player_a/b`, `score_a/b`, and the full description all flow through SSE now.
- **Concurrent SSE/WS connection caps per tier** — 1 (free), 3 (starter), 25 (pro), 100 (business), 1000 (enterprise). Stops abuse, keeps the pipe healthy for everyone.
- **Sub-second WebSocket frame capture** for sportsbook in-play state — DK / FD / Pinnacle / bet365 sources now all run a parallel WS-frame layer that catches push events the REST refetch misses. Verifying parsers against live games this weekend.
- **Pinnacle period_odds polling tightened** from 4s to 2s — captures more intermediate values during fast scoring runs.


**New documentation:**

- [Streaming docs](/docs/streaming) — unified SSE + WS reference with per-tier caps
- [Webhooks docs](/docs/webhooks) — full reference with HMAC signature verification examples (Python + JavaScript)
- [Migration from The Odds API](/docs/migrate-from-the-odds-api) — drop-in compatibility, savings calculator
- [API versioning policy](/docs/api-versioning) — formal deprecation contract, /v1 stability guarantee
- [vs/the-odds-api](/vs/the-odds-api) — side-by-side with annual savings calculator (15-20x cheaper at most volumes)
- [vs/oddsjam](/vs/oddsjam) — honest take, when to use which
- [vs/sportsdataio](/vs/sportsdataio) — honest take, different buyers
- [/built-with](/built-with) — projects customers are shipping with the API. Want yours featured? DM me.
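
If you're wiring up webhooks before reading that reference: the examples in the docs follow the standard HMAC pattern. A minimal Python sketch, assuming a hypothetical `X-Parlay-Signature` header carrying a hex HMAC-SHA256 of the raw request body (the real header name and encoding are in /docs/webhooks):

    import hashlib
    import hmac

    def verify_webhook(raw_body: bytes, signature_header: str, secret: str) -> bool:
        # Recompute HMAC-SHA256 over the raw body; compare in constant time
        # so timing differences can't leak signature bytes.
        expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, signature_header)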


**SDKs:**

- **JavaScript SDK published** — `npm install parlay-api`. Drop-in compatible with the-odds-api JS clients, with extensions for prediction markets, DFS, PBP, period markets, plus async iterators for SSE / WS streams. Built-in math helpers (devig, Kelly sizing).
- Python SDK already on PyPI: `pip install parlay-api`


**Internal infra (less interesting but might affect uptime):**

- 3-tier failover Worker probe tightened from 30s to 5s
- Cloudflare edge cache for static pages — marketing site stays up even if M4 origin blips
- Cloudflared tunnel restart Slack alerts (so I notice if it cycles)
- Daily backup verified working (645-702 MB nightly, 3-day rotation)
- Discovery scripts moved to TCC-safe path (was hitting macOS Operation-not-permitted)
- Fraud detection on signup — disposable-email blocklist + 3-signups-per-IP-per-24h cap


**Coming soon:**

- Annual prepay 15% discount (Stripe coupon setup this week)
- Pay-as-you-go tier for occasional / WS-curious users (per-call pricing)
- Slack bot interface for me (so I can interact with the API + CRM from my phone)
- Verified sub-2s state-change PBP across all major US sports (currently flowing on tennis, finalizing DK/FD/Pinnacle SPA capture)


**As always:** drop questions, requests, or bugs below. I read everything. Most user-requested features ship within a week or two if they're scoped reasonably.

reddit.com
u/JacobTheBuddha — 5 days ago

Three Deal-Breaker Questions before paying any Sports API in 2026

Quick test:

  1. Does the entry tier include Pinnacle? If no, walk. EV math doesn't work without sharp lines.
  2. Is there a real free tier with the same data shape as paid? If "free" gives you broken or fake data, you can't validate before committing. Walk.
  3. Do you have prediction markets and DFS-style books in the same feed? These are 2026 markets, not 2018. Aggregators that don't carry them are quietly behind.

ParlayAPI passes all three on Starter ($5/mo):

  1. Pinnacle on Starter: yes
  2. Free tier with full data shape: 1,000 credits/mo
  3. Polymarket, Kalshi, PrizePicks, Underdog, Sleeper, Dabble, ParlayPlay, Pick6, all in /v1/odds

If you're paying $30+/mo for an aggregator that fails any of these tests, you're paying a premium for the wrong tool.

reddit.com
u/JacobTheBuddha — 6 days ago

Why I built ParlayAPI for my own bot first

A little behind-the-scenes story you don't always get: I run my own EV scanner. Things I needed and couldn't find on aggregator APIs:

  • Pinnacle on the cheap tier (most aggregators gate Pinnacle on Pro+)
  • Prediction markets (Polymarket, Kalshi) in the same feed as sportsbooks
  • DFS-style apps (PrizePicks, Underdog) carried as proper book sources
  • Play-by-play to drive live-betting triggers
  • Per-call pricing that doesn't punish high-frequency polling

Built ParlayAPI initially as my own internal aggregator with each integration done from scratch. Worked well enough that other people asked for access. So now it's a public API.

I still use it as my own EV scanner backend. $5/mo Starter covers 95% of my usage. The product I sell is the product I bet with.

That's not normal. Most sports data vendors have never placed a bet and built their product based on what enterprise procurement teams ask for. ParlayAPI is built around what I wanted as a bettor.

reddit.com
u/JacobTheBuddha — 6 days ago

If you're on The Odds API at any tier, here's the math on switching

Side-by-side from each vendor's public pricing page:

| Tier       | The Odds API                  | ParlayAPI                  |
|------------|-------------------------------|----------------------------|
| Free       | 500 / mo                      | 1,000 / mo                 |
| Entry paid | $30 / mo for 20K ($1.50/1k)   | $5 / mo for 50K ($0.10/1k) |
| Pro        | $59 / mo for 100K ($0.59/1k)  | $30 / mo for 1M ($0.03/1k) |

15x cheaper at entry. 20x cheaper at Pro. Same major US books at every tier, plus Pinnacle (TOA gates it behind higher tiers), Bovada, Kambi-network books (Unibet, PMU, BetRivers, Hard Rock), Polymarket, Kalshi, every DFS-style book, and play-by-play across NFL/NBA/MLB/NHL/soccer/MMA/tennis.

Migration: change https://api.the-odds-api.com/v4/... to https://api.parlay-api.com/v1/.... Response shape is identical, drop-in compatible. Done.

If you're on TOA and every market your model needs is one we've already indexed, you're paying middleware tax for no reason. (And if we're missing one, DM me and I'll add it.)

reddit.com
u/JacobTheBuddha — 6 days ago

"Real-time" means 4 different things in sports data. Verify before you trust.

Every vendor advertises real-time. They mean different things:

  1. Sub-second push (true real-time): WebSocket subscription, events at ~100-500ms from source. Polymarket's CLOB. Rare on retail tiers.
  2. Polled real-time (1-3s): vendor polls upstream every 1-2s, fastest at retail tier.
  3. Cached real-time (5-15s): vendor polls every 10-30s, caches. Most aggregators.
  4. Real-time-when-it-matters (varies): vendor polls fast during marquee events, slow otherwise. Ratio swings.

For live-betting bots, 1s vs 10s is the difference between profit and loss.

Verify your source's actual cadence: hit an endpoint on a stopwatch during a live game, watch how often the response actually changes. Don't trust marketing copy.

reddit.com
u/JacobTheBuddha — 6 days ago

Never use synthetic odds for sports betting backtesting. Here's what to use instead.

Common mistake: you have no historical odds for some date range, so you generate "synthetic odds" from win percentages or Elo. Reasonable-sounding. It will lie to you.

Synthetic odds are smooth. Real odds are not. Real lines move on injury news, weather, sharp money, public hype. They have variance, errors, arbitrage windows. Backtesting on synthetic odds tells you how your strategy performs in a fictional universe where lines are always perfectly informed.

Specific failure modes:

  • Line-movement-timing strategies show fake edge (synthetic data has no movement)
  • Cross-book arb strategies show fake edge (no per-book noise in synthetic)
  • EV-vs-soft-book strategies show fake edge (synthetic = sharp by construction, soft books look much worse than reality)

What to use instead:

  • Real historical odds wherever available (we have ~10 years for major US sports, including pre-game and closing for MLB / NFL / NBA / NHL / soccer)
  • Walk-forward validation (train weeks 1-10, test week 11, advance)
  • Out-of-sample testing on data the model has not seen
  • Realistic execution costs: include vig, include slippage, include account-limit risk
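The walk-forward point deserves a sketch, since it's the step people skip. Assuming your data is a chronologically sorted list of weeks and you supply your own train/bet functions:

def walk_forward(weeks, train_fn, bet_fn, window=10):
    # Train on a rolling window of past weeks, bet the next one, advance.
    # No test week ever leaks into its own training set.
    pnl = []
    for i in range(window, len(weeks)):
        model = train_fn(weeks[i - window:i])  # e.g. train weeks 1-10
        pnl.append(bet_fn(model, weeks[i]))    # test on week 11
    return sum(pnl)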

If your synthetic-odds backtest claims 1000% ROI over 5 years, it's wrong. Always.

reddit.com
u/JacobTheBuddha — 6 days ago

The Odds API charges 6x what ParlayAPI does for the same data. Real math.

Pulled this from each vendor's public pricing page (May 2026). Verify before relying on it.

The Odds API:

  • Free: 500 requests/month
  • Plus: $30/month for 20,000 requests = $1.50 per 1,000 requests
  • Pro: $59/month for 100,000 requests = $0.59 per 1,000 requests

ParlayAPI:

  • Free: 1,000 credits/month (2x their free tier)
  • Starter: $5/month for 50,000 requests = $0.10 per 1,000 requests
  • Pro: $30/month for 1,000,000 requests = $0.03 per 1,000 requests

At the entry paid tier, that's 15x cheaper per call. At Pro, 20x cheaper.

What you get for the lower price:

  • Same US books (DraftKings, FanDuel, BetMGM, Caesars, Pinnacle)
  • Plus Bovada, Kambi-network books (Unibet, PMU, BetRivers, Hard Rock), prediction markets (Polymarket, Kalshi), exchanges (ProphetX), DFS books (PrizePicks, Underdog, Sleeper, Dabble, ParlayPlay, Pick6) — all of which The Odds API does not carry
  • Cross-sport play-by-play (NFL, NBA, MLB, NHL, soccer, UFC, tennis) — also not on The Odds API
  • Drop-in compatible URL/schema, swap is a one-line change

The price gap isn't because we're new and cheap-by-necessity. It's because we built direct integrations instead of layering on top of someone else's aggregator. Different cost structure for us, same data for you.

If you're paying $30+/month for a sports odds API and not getting prediction markets, DFS books, or play-by-play, you're paying middleware tax. There's a cheaper option that does strictly more.

Parlay API

u/JacobTheBuddha — 6 days ago

You suck at AI prompting (and a one-line fix)

Most people prompt AI like they're searching Google. "Write me a script to do X." Then they're surprised the output is mediocre.

Models are roleplay engines. They write better when they think they're a competent expert.

Compare:

Bad: "Write me a Python script to detect arbitrage between sportsbooks."

Better: "You are an infinite super-genius quantitative trader who loves sports betting and has built arbitrage scanners for a decade. Assume the reader has equal or greater capability. Write a Python arbitrage detector."

The first one gives you a Stack Overflow answer. The second one handles edge cases you didn't think to ask about (max bet limits, line cancellations, two-way devigging) because the model thinks it's writing for a peer.

We use this exact pattern internally. Plug in the domain ("infinite super-genius bettor who loves algorithmic betting") and output quality jumps.

It's free, it works, most people don't do it.

reddit.com
u/JacobTheBuddha — 6 days ago

What is positive EV betting (and how to actually find it)

Positive EV betting is the math version of "this bet pays more than its true odds suggest." If a coin flip is 50/50 and someone offers +110 on heads, that's +EV: you lose half the time but win 110/100 the rest, averaging a 5% edge.

The hard part is figuring out the true odds. Sportsbooks don't post them. You either model the game yourself or use a sharp book's price as a proxy. Pinnacle is the standard proxy because they take sharp action and adjust aggressively.

Workflow:

  1. Pull odds for the same market from Pinnacle and from a soft book (DK, FD, MGM, Caesars).
  2. Convert both to implied probability.
  3. Devig Pinnacle's two sides so they sum to 100%.
  4. Compare to the soft book. If the soft book's implied probability is lower than Pinnacle's devigged true probability, that's positive EV.

Math:

  • American +150: implied = 100 / (150 + 100) = 40%
  • American -150: implied = 150 / (150 + 100) = 60%

ParlayAPI gives you both books in one call so writing this scanner is mostly format-shuffling. Free tier (1,000 credits/mo) handles a small EV scanner across NBA/MLB/NFL/NHL during their seasons.
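The workflow above fits in a few lines. A sketch using a simple multiplicative devig (the most basic of several devig methods; step numbers match the list above):

def implied(american: int) -> float:
    # American odds -> implied probability (step 2).
    return 100 / (american + 100) if american > 0 else -american / (-american + 100)

def ev_vs_pinnacle(pinn_side: int, pinn_other: int, soft_side: int) -> float:
    # Devig Pinnacle's two sides so they sum to 100% (step 3),
    # then compare against the soft book's payout (step 4).
    p_raw, q_raw = implied(pinn_side), implied(pinn_other)
    p_true = p_raw / (p_raw + q_raw)
    decimal = soft_side / 100 + 1 if soft_side > 0 else 100 / -soft_side + 1
    return p_true * decimal - 1  # EV per $1 staked; positive = +EV

# Pinnacle -105 / -105 devigs to a true 50%; a soft book hanging +110
# on the same side is the coin-flip example from the top: a 5% edge.
print(f"{ev_vs_pinnacle(-105, -105, 110):+.3f}")  # +0.050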

reddit.com
u/JacobTheBuddha — 6 days ago