r/Sabermetrics

MLB division standings display

My GitHub repo fetches live MLB NL West standings via the MLB Stats API and composites them onto a background image with team pennants, W-L records, and games-back figures. The renderer outputs a 960×1280 PNG to a GitHub Pages-hosted public/ folder, making the image accessible over HTTP as a simple static URL. The reTerminal polls that URL on a schedule to refresh the display — no server required.
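For anyone curious about the fetch-and-parse half of that pipeline, here's a minimal sketch against the public MLB Stats API. The endpoint parameters (leagueId=104 for the NL, division id 203 for the NL West) and response fields are my assumptions from the API, not code from the repo, and the image compositing step is omitted:

```python
import json
import urllib.request

STANDINGS_URL = (
    "https://statsapi.mlb.com/api/v1/standings"
    "?leagueId=104&season=2025&standingsTypes=regularSeason"
)

def fetch_standings(url=STANDINGS_URL):
    """Download the raw standings payload as a dict."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def parse_nl_west(payload, division_id=203):
    """Pull (team, wins, losses, gamesBack) rows for one division."""
    rows = []
    for record in payload.get("records", []):
        if record.get("division", {}).get("id") != division_id:
            continue
        for tr in record.get("teamRecords", []):
            rows.append((tr["team"]["name"], tr["wins"], tr["losses"], tr["gamesBack"]))
    return rows

# Usage (network required):
#   for team, w, l, gb in parse_nl_west(fetch_standings()):
#       print(f"{team:24s} {w}-{l}  GB {gb}")
```

From there the renderer only needs those four fields per team to composite the pennants and records onto the background image.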

u/Dave-356w — 12 hours ago

Hey everyone,

I recently finished building THE NINE — not just the app, but the full workflow around it — and I’d really appreciate some honest feedback from people who work with game data.

I’m not trying to sell anything here.
I’m trying to answer one question:

Is it immediately clear what this actually does and what it requires?

The problem I’m trying to solve

After a game, everything is scattered:

  • video
  • pitch data (TrackMan / similar)
  • lineup / roster
  • notes, reports, clips

Even for teams that do have data, there’s no clean way to connect everything into one review workflow.

What the system does

You give it:

  • full game video
  • lineup / roster
  • pitch-by-pitch CSV (TrackMan or equivalent)

And it turns that into one structured package:

  • full logged game (pitch-by-pitch)
  • synced video clips
  • play-by-play + box score outputs
  • pitch data exports
  • player reports + review views
  • a read-only review app + portal access

What I’m trying to understand

If you open the site for 30–60 seconds:

👉 Is it clear what the system needs from you?
👉 Is it clear what you get back?
👉 Or does it feel like it requires more than it actually does?

Site: https://the-nine-app.live

I’m especially interested in critical feedback — if something is confusing or feels like overkill, that’s exactly what I need to hear.

Thank you all.

u/LegitimateAdvice1841 — 4 days ago

An early look at each qualified hitter's plate discipline (K-BB%) and extra-base hit power (ISO)
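Both axes of that chart are easy to derive from standard counting stats; a minimal sketch (function names are mine):

```python
def iso(ab, h, doubles, triples, hr):
    """Isolated power: SLG minus AVG, i.e. extra bases per at-bat."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab - h / ab

def k_minus_bb_pct(so, bb, pa):
    """K-BB%: strikeout rate minus walk rate, per plate appearance (lower is
    better discipline for a hitter)."""
    return (so - bb) / pa
```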

u/ritmica — 5 days ago

Bootstrap on my first 421 picks: 88% confidence of long-run +ROI, but I'm 42.8% straight up. What am I missing?

Spent the last few months building a probabilistic prediction model for NBA and MLB game outcomes. Standard hobbyist stack: Elo + recent form + injury drag + pitcher-level priors for MLB + line-movement signal + per-sport calibration shrink. Outputs a calibrated p(side wins) for each market.

Yesterday I finally ran proper validation on 421 settled picks and the result is interesting enough I want to ask for methodology critique.

**The headline tension:**

* Raw hit rate: 42.8% (n=421, Wilson 95% CI [38.1%, 47.5%])

* Sounds bad. Standard -110 breakeven is 52.4% so naive read is "model is losing."

* But mean decimal odds taken is 2.94 (model picks a lot of dogs and small parlays), so actual mix breakeven is 42.4%.

* Bootstrap on actual P/L (1000 resamples, 1u stakes): mean ROI +8.6%, 95% CI [-5.4%, +22.4%], P(ROI > 0) = 0.885.
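For anyone who wants to replicate that last bullet, a minimal sketch of the bootstrap as described (1,000 resamples of per-pick P/L at 1u stakes; stdlib only, names are mine):

```python
import random
import statistics

def bootstrap_roi(pnl, n_resamples=1000, stake=1.0, seed=42):
    """Resample per-pick P/L with replacement and summarize the ROI distribution.

    pnl: one entry per settled pick, e.g. (decimal_odds - 1) * stake for a win,
    -stake for a loss. Returns (mean ROI, (2.5th, 97.5th pct), P(ROI > 0)).
    """
    rng = random.Random(seed)
    n = len(pnl)
    rois = []
    for _ in range(n_resamples):
        sample = rng.choices(pnl, k=n)        # resample picks with replacement
        rois.append(sum(sample) / (n * stake))
    rois.sort()
    lo = rois[int(0.025 * n_resamples)]
    hi = rois[int(0.975 * n_resamples)]
    p_positive = sum(r > 0 for r in rois) / n_resamples
    return statistics.mean(rois), (lo, hi), p_positive
```

Because it resamples realized P/L rather than hit rate, this automatically respects the actual bet mix, which is exactly why it can show likely +EV at a 42.8% hit rate.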

Per sport:

* MLB n=322: hit_rate 44.7%, breakeven 43.9%, bootstrap mean ROI +6.65%, P(>0) = 0.798

* NBA n=94: hit_rate 38.3%, breakeven 37.9%, bootstrap mean ROI +19.94%, P(>0) = 0.851

So the bootstrap is saying long-run +EV is more likely than not, but I'm at the sample size where confidence intervals on ROI still cross zero. The "I'm losing because hit rate is below 50%" naive read is misleading because the bet mix has different breakevens.

**The validation finding (the actual question):**

I bucket every pick into confidence tiers based on (model_p, fanduel_edge). The CLV-aware data on the top tier surprised me:

* Top tier (n=108 settled, 5 with closing-line data): 100% beat the closing line, +21.27pt avg CLV, +24.56% bucket ROI

* Middle tier (n=199, 19 with CLV): 73.7% beat-close, +1.46pt avg CLV, +8.06% ROI

* Auto-parlay tier (n=86): 25% hit, -18.81% ROI. This is broken. Generation thresholds were too loose.

The high-confidence tier is doing real work: 100% beat-close (small sample but consistent direction) plus +21pt CLV says the model is picking the sharper side of the market on its strongest signals. The auto-parlay tier is hemorrhaging because parlay miscalibration compounds multiplicatively while my per-sport calibration shrink is tuned for singles.
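For reference, "points" of CLV here can be computed as the gap in implied probability, in percentage points, between the price taken and the closing price. The post doesn't specify its odds format, so this sketch assumes decimal odds as one plausible convention:

```python
def implied_prob(decimal_odds):
    """Implied win probability of a decimal price (vig included)."""
    return 1.0 / decimal_odds

def clv_points(odds_taken, odds_close):
    """Closing-line value in percentage points. Positive means you beat the
    close: the market moved toward your side after you bet."""
    return 100.0 * (implied_prob(odds_close) - implied_prob(odds_taken))
```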

**What I'd love methodology feedback on:**

  1. **Per-tier-vs-parlay calibration.** I shrink model_p toward 0.5 based on per-(sport, market_type) historical hit-rate gaps. Singles are well-calibrated. When I multiply N calibrated leg probabilities to get a parlay prob, miscalibration compounds and the parlay prob is consistently overstated. Has anyone solved this cleanly: leg-level Platt scaling tuned specifically for parlay use, hierarchical Bayesian per-leg priors, something else?

  2. **CLV stamping coverage.** I currently have closing-line data on only 24 of 421 settled picks because the snapshot loop wasn't reliably running for the first months. Going forward every new pick gets stamped automatically. Should I weight calibration adjustments toward CLV-validated rows even at small n, or wait for more data?

  3. **Bootstrap interpretation.** With P(ROI > 0) = 0.885 and 95% CI crossing zero, what's the responsible way to communicate this externally? "Probably profitable" feels honest but is harder to falsify than a Sharpe-style number. Curious how people working on similar discrete-outcome prediction systems frame their confidence.
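On question 1, the compounding is easy to see numerically: if each calibrated leg probability overstates the true probability by a constant factor, an N-leg parlay overstates it by that factor to the Nth power. A toy sketch (the shrink helper is a hypothetical illustration, not the system's actual calibration):

```python
def parlay_prob(leg_probs):
    """Naive parlay probability: product of (assumed independent) leg probs."""
    p = 1.0
    for leg in leg_probs:
        p *= leg
    return p

def shrink(p, lam):
    """Shrink a probability toward 0.5; lam=0 is no shrink, lam=1 is a coin flip.
    A parlay-specific lam (larger than the singles value) is one crude way to
    counteract the multiplicative compounding below."""
    return 0.5 + (1 - lam) * (p - 0.5)

# If each leg's modeled p overstates truth by 5% (0.63 modeled vs 0.60 true),
# a 4-leg parlay's modeled prob overstates truth by 1.05**4, about 22%.
```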

I keep an open-book journal where every pick is logged before kickoff and graded automatically against ESPN's scoreboard. Happy to share the link in a comment if useful for context; not the point of the post.

u/mangoman40114 — 17 hours ago

I'm a data analyst by day. About 18 months ago I got tired of losing on props by going with my gut, so I started treating it like a work problem. Built a Postgres database that ingests box scores via the NBA stats API, PrizePicks lines from a scraper I wrote, and rotation data from a combo of the NBA's hustle stats endpoint and pbp stats. Everything is timestamped and versioned so I can re-run any historical window.

The dataset: 412 regular season games from Nov 2024 through April 2025, plus the same window for the 2023-24 season for validation. Every starter and 6th man. Points, rebounds, assists, 3PM, and steals+blocks. That's roughly 4,800 player-game rows per season.

Here's what held up across both seasons.

Edge 1: High-usage guards on back-to-back unders (PTS and AST)

I defined "high-usage" as >26% usage rate per Cleaning the Glass. Then I filtered for guards playing their 2nd game in 2 nights where they played >30 min the night before.

2023-24 season: 87 qualifying player-games. Under hit on points at 58.6%. Under hit on assists at 61.2%. Average line on points was 22.4, average actual was 19.1. That's a -3.3 delta.

2024-25 season: 91 qualifying player-games. Under on points: 56.0%. Under on assists: 59.3%. Average line 22.8, average actual 20.0. Delta: -2.8.

The edge compressed slightly year over year but stayed significant. For context, a 57% hit rate at -110 implies roughly an 8.8% ROI (0.57 × 10/11 − 0.43 ≈ +0.088 per unit). Over a season with maybe 2-3 of these spots per week, that's ~60 bets. At 1 unit each, you're looking at about +5.3 units on average. Not life-changing, but it's free money if you're disciplined.

The mechanism is pretty obvious when you think about it: these guys are running the offense, carrying the ball up, taking the tough shots. On night 2 after 32+ minutes of that, the legs go first. Shot velocity drops. They settle. Assists dry up because they're not driving and kicking as hard. The books shade maybe 0.5 points from the normal line but the real performance hit is 2-3x that.

Specific example: Ja Morant, Dec 14 2024 (2nd night of B2B after 34 min vs IND). Line was 24.5 points. He put up 16 on 6-of-17 shooting with 4 assists (line was 7.5). Under both by a mile. This pattern repeated for Shai, Fox, Maxey, Brunson. The only guys who seemed immune were LeBron (he's a freak) and occasionally Luka (who will literally shoot his way into volume regardless of fatigue, but his efficiency tanks).
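For concreteness, the Edge 1 filter might look something like this in pandas (all column names here are hypothetical, not the actual schema):

```python
import pandas as pd

def b2b_under_rate(df):
    """Under hit rate for high-usage guards on the 2nd night of a B2B.

    Expects one row per player-game with hypothetical columns: usage_pct,
    position, is_b2b_second_night, prev_game_minutes, points_line,
    points_actual. Returns (qualifying sample size, under hit rate).
    """
    q = df[
        (df["usage_pct"] > 26.0)
        & (df["position"].isin(["PG", "SG"]))
        & df["is_b2b_second_night"]
        & (df["prev_game_minutes"] > 30)
    ]
    under_rate = (q["points_actual"] < q["points_line"]).mean()
    return len(q), under_rate
```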

Edge 2: Rest-advantage overs for big men (REB only)

This one surprised me. I expected rest advantage to matter more for guards given the running, but the rebounding edge for well-rested bigs was actually cleaner.

Filter: Centers and PFs with >24 min/g, coming off 2+ days rest, facing a team on a B2B. Rebounds line only.

2023-24: 104 qualifying games. Over hit 54.8%. Average line 9.2, average actual 10.1. Delta +0.9.

2024-25: 98 qualifying games. Over hit 56.1%. Average line 9.4, average actual 10.4. Delta +1.0.

Why this works: When the opponent is on a B2B, their guards are slower getting back in transition, their bigs are slower to box out, and there are more live-ball rebounds available in general because shooting percentages drop on B2Bs too. The well-rested big feasts on the chaos. It's not that he's playing better, it's that the environment creates more available rebounds.

I watched this play out in real time with Domantas Sabonis on March 3, 2025. Kings had 2 days rest. Hawks were on a B2B. Sabonis line was 11.5 rebounds. He grabbed 19. Wasn't even close. The Hawks bigs looked like they were moving in sand.

Edge 3: The 0.5 point line move signal

I tracked every prop line from open to close for the 2024-25 season using 15-minute snapshots. When a player prop line moved 0.5 points or more from open to game-time close, the direction of the move correlated with the result at 59.3% across 1,240 qualifying moves.

That number is absurd if you think about what it means. The books are adjusting because sharp money came in, and that sharp money is right almost 60% of the time. If you could just ride the coattails of line moves that size, you'd have a 7% edge at -110 without doing any analysis of your own.

The problem: detecting the move requires checking the line multiple times between open and close. I automated it. If you can't automate it, set a reminder to check PrizePicks and DraftKings at open and then again 90 minutes before tip. If the line moved 0.5+, ride it. If it didn't, pass.

One important caveat: this edge is stronger on totals and spreads than on player props specifically. On player props the sample is smaller and the noise is higher. But the direction holds.
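The detection step itself is trivial once you have the open and pre-tip snapshots; a sketch (signature and threshold convention are mine):

```python
def move_signal(open_line, close_line, threshold=0.5):
    """Return 'over', 'under', or None based on where a prop line moved.

    A line moving up suggests money came in on the over; only ride the
    direction when the move is at least `threshold` points.
    """
    delta = close_line - open_line
    if delta >= threshold:
        return "over"
    if delta <= -threshold:
        return "under"
    return None
```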

What doesn't work (despite what you've heard):

Home/away splits: I ran a paired t-test on every starter's home vs away performance. Out of 143 qualifying players, 21 had a statistically significant difference (p < 0.05). That's 14.7%, versus the ~5% you'd expect by chance alone at that threshold, but with 143 uncorrected comparisons there's no way to tell which of those 21 are real. The "home court advantage" for individual player props is largely a myth.

"Trending" overs/unders: A player going over 4 out of 5 games has zero predictive value for game 6. I checked. The over rate for players coming off 4+ overs in their last 5 was 51.2%. That's coin flip territory. Recency bias is the single most expensive cognitive error in prop betting.

I'm happy to share the SQL queries or the schema if anyone wants to replicate this.
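As a starting point for replication, here's a hypothetical sketch of what the table and the Edge 1 query could look like, using SQLite for portability (the actual database is Postgres and the real schema surely differs):

```python
import sqlite3

SCHEMA = """
CREATE TABLE player_games (
    player_id      INTEGER,
    game_date      TEXT,
    position       TEXT,
    usage_pct      REAL,
    minutes        REAL,
    is_b2b_second  INTEGER,   -- 1 if 2nd game in 2 nights
    prev_minutes   REAL,      -- minutes played the night before
    points_line    REAL,
    points_actual  REAL
);
"""

EDGE1_QUERY = """
SELECT COUNT(*)                         AS n,
       AVG(points_actual < points_line) AS under_rate
FROM player_games
WHERE usage_pct > 26.0
  AND position IN ('PG', 'SG')
  AND is_b2b_second = 1
  AND prev_minutes > 30;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO player_games VALUES (?,?,?,?,?,?,?,?,?)",
    [
        (1, "2024-12-14", "PG", 31.2, 36.0, 1, 34.0, 24.5, 16.0),
        (2, "2024-12-14", "SG", 27.5, 33.0, 1, 32.0, 22.5, 25.0),
    ],
)
n, under_rate = conn.execute(EDGE1_QUERY).fetchone()
```

The comparison inside AVG() evaluates to 0/1 per row, so the average is the under hit rate directly.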

u/Fancy-Tadpole-2448 — 8 days ago

Hi everyone. I'm developing a Python ETL pipeline to feed a predictive Machine Learning model (XGBoost) for MLB.

It's worth noting that I'm a beginner at this. I have some background because I'm studying systems engineering, but I'm building this almost entirely through "vibe coding." This is my first time building a prediction system.

Currently, I'm using Python and SQLite. My automated pipeline already extracts raw physical data from Baseball Savant/Statcast (xwOBA allowed, Barrel%, K%, BB%, etc.) and merges it with scheduled games using StatsAPI. I've already solved the lookahead bias with a strict backward pd.merge_asof, ensuring the model only sees metrics available the day before the game. The base model is already running, evaluating hitting, splits, and park factors.
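For readers unfamiliar with that guard, it can be sketched like this: direction="backward" takes the latest metric row at or before each game date, and allow_exact_matches=False pushes that to strictly before, so the model never sees same-day data (column names and values here are made up, not my actual pipeline):

```python
import pandas as pd

games = pd.DataFrame({
    "game_date": pd.to_datetime(["2025-06-02", "2025-06-03"]),
    "pitcher_id": [123, 123],
}).sort_values("game_date")

metrics = pd.DataFrame({
    "asof_date": pd.to_datetime(["2025-06-01", "2025-06-02", "2025-06-03"]),
    "pitcher_id": [123, 123, 123],
    "xwoba_allowed": [0.305, 0.298, 0.310],
}).sort_values("asof_date")

# Attach the most recent metrics strictly before each game date.
merged = pd.merge_asof(
    games, metrics,
    left_on="game_date", right_on="asof_date",
    by="pitcher_id",
    direction="backward",
    allow_exact_matches=False,
)
```

Games with no strictly-earlier metric row come back as NaN, which is the safe failure mode here.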

The Problem: To improve my model's Brier Score and Log Loss, I need to inject the full spectrum of advanced pitching metrics (all variables from the 'Advanced', 'Batted Ball', and 'Plate Discipline' dashboards, including SIERA, FIP, xFIP, LOB%, SwStr%, K-BB%, etc.). I need this bulk extraction at two levels: individual starters and grouped by team (to isolate the collective performance of the bullpen).

FanGraphs is the standard source for these consolidated dashboards, but I've hit a hard technical roadblock:

  • Direct export of CSV files is locked behind their premium subscription (FanGraphs+).
  • I tried extracting the data by directly consuming their backend API (JSON endpoints) passing the splits and dates parameters, but their anti-bot system (Cloudflare) constantly throws a 403 Error.
  • To bypass Cloudflare, I implemented cloudscraper and then tried TLS Spoofing using the curl_cffi library (impersonating Chrome 120), but the server still rejects the connection or data request due to lack of authentication.
  • I also tried using the pybaseball library (pitching_stats), but it breaks or fails when trying to extract short daily date ranges and specific bullpen splits in bulk.

What I'm looking for: Since I want to maintain the script's automation without relying on a manual "copy-paste" process for tables, or paying hundreds of dollars for a commercial API, I'm looking for your technical recommendations:

  1. Do you know of any specific headers/cookies configuration, or any Python scraping tool that is currently successfully bypassing FanGraphs' Cloudflare for bulk data requests?
  2. Is there a robust alternative source (free API or less protected website) where I can automate the daily download of all these sabermetric pitching metrics?
  3. Alternatively, does anyone have experience or a reference repository calculating this entire block of advanced metrics (SIERA, FIP, xFIP, etc.) locally in SQLite/Python using only raw play-by-play (Pitch-by-Pitch) data from Statcast/Retrosheet? (I have some of the formulas, but calculating the league constant coefficients on the fly for the entire pool of metrics seems error-prone and computationally expensive).
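On question 3, FIP is the one metric on that list that's genuinely easy to compute locally: the league constant just re-centers the formula onto the league ERA scale, so it falls out of season-total league lines (xFIP additionally needs league HR/FB, and SIERA is a fitted model, so those are harder). A minimal sketch of the standard formula:

```python
def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """League constant that re-centers FIP onto the league ERA scale."""
    return lg_era - (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_k) / lg_ip

def fip(hr, bb, hbp, k, ip, c_fip):
    """Fielding Independent Pitching: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + c_fip
```

A useful sanity check: a pitcher with exactly league-average component rates should come out with FIP equal to league ERA.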

I'd appreciate any guidance on data architecture, evasive scraping techniques, or applied sabermetrics.

u/gsus_21 — 10 days ago

Or, better yet, what's the BABIP for each situation, varying by how many outs there are and which bases are occupied?

I feel like this is a calculation that's been key to the Brewers' success: the understanding that hits are way more likely when the infield is in, and so they've built a team that creates situations that bring the infield in with speed and a contact-first approach.
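Computing that split from play-by-play is mostly a matter of tagging each ball in play with its outs/base state before aggregating; a minimal sketch (the row format is hypothetical, and BABIP's usual denominator, AB − K − HR + SF, is assumed to be handled upstream when selecting balls in play):

```python
from collections import defaultdict

def babip_by_state(balls_in_play):
    """BABIP split by (outs, base_state).

    Each row is (outs, base_state, is_hit), where base_state is e.g. '1_3'
    for runners on first and third, and strikeouts/home runs are already
    excluded per the BABIP definition (H - HR) / (AB - K - HR + SF).
    """
    tallies = defaultdict(lambda: [0, 0])  # (hits, balls in play) per state
    for outs, bases, is_hit in balls_in_play:
        t = tallies[(outs, bases)]
        t[0] += int(is_hit)
        t[1] += 1
    return {k: h / n for k, (h, n) in tallies.items()}
```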

u/TheLostPariah — 13 days ago

I’ve been building a site that lets you log games you attended and then see aggregated player stats from the games you saw live. It’s less fantasy and more personal game-history tracking. I’d genuinely love feedback on what stats or filters would matter most to serious baseball stat people. https://gamedaychasers.com

u/mycahgr — 11 days ago