Is It November yet?
I miss the action. Who are you guys most excited about watching next season?
Quick update for anyone tracking this. Just shipped a layout overhaul and a few new model-evaluation views on datadrivenpicks.club.
The model itself: point-level Markov chain (p_in / p1w_serve / p2w_serve per player per match) with surface and recent-form adjustments. v3 launched a couple weeks ago and stacks an XGB winner-blend on top + separate XGB regressors for game margin and total games, trained on ~19k WTA + Challenger matches.
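For anyone curious what a point-level chain looks like in practice, here is a minimal sketch of the game-level step. It is not the site's code. It assumes the three per-player parameters combine as p_in * p1w_serve + (1 - p_in) * p2w_serve to give a single serve-point win probability, and that points are independent; the function names are made up for illustration.

```python
from functools import lru_cache


def serve_point_win_prob(p_in: float, p1w_serve: float, p2w_serve: float) -> float:
    """Assumed combination: first serve lands with p_in, otherwise play the second serve."""
    return p_in * p1w_serve + (1.0 - p_in) * p2w_serve


def game_win_prob(p: float) -> float:
    """Probability the server wins a game when each point is won with probability p."""
    q = 1.0 - p
    # From deuce the server needs two points in a row before the returner gets two.
    deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win_from(s: int, r: int) -> float:
        # s, r = points won so far by server and returner (0..4)
        if s == 3 and r == 3:
            return deuce
        if s == 4:
            return 1.0
        if r == 4:
            return 0.0
        return p * win_from(s + 1, r) + q * win_from(s, r + 1)

    return win_from(0, 0)


# Example with made-up inputs: 62% first serves in, 70% / 50% points won behind them.
p_point = serve_point_win_prob(0.62, 0.70, 0.50)
p_game = game_win_prob(p_point)
```

The same recursion extends to sets and matches by treating games as the "points" of the next level up, alternating which player's serve probability is used.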
What's new on the site:
Current track record (ML market, match-level dedupe to favored side):
Free, no paywall, no monetization. Performance page has everything public.
Site: https://datadrivenpicks.club
Daily edges on Twitter: https://twitter.com/Gay4WTA
Open to feedback — anything that looks miscalibrated or over-engineered, I want to hear it.
I've been working on a forecasting project for women's hockey — currently 565 games across the PWHL (full play-by-play), IIHF Women's World Championship (box scores OCR'd from official PDFs across 7 tournaments since 2017), and Olympic women's tournaments (2018, 2022, 2026 via Wikipedia tables).
Per-league forecasting performance, time-based 80/20 split:
Approach is league-specific Elo (tuned via grid search — international K=100/divisor=150, PWHL K=50/divisor=400 since the leagues have very different talent stratification) plus a Poisson goal-totals layer for over/under analysis.
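To make the K/divisor terminology concrete, here is a rough sketch of the Elo step. It assumes "divisor" means the scale in the logistic expectation (the 400 in standard Elo), so a smaller divisor turns the same rating gap into a more lopsided win probability. The function names are hypothetical, and it leaves out anything like home ice or margin-of-victory adjustments.

```python
def elo_expected(rating_a: float, rating_b: float, divisor: float = 400.0) -> float:
    """Expected score (win probability) for team A under a logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / divisor))


def elo_update(rating_a: float, rating_b: float, score_a: float,
               k: float = 50.0, divisor: float = 400.0) -> tuple[float, float]:
    """Return updated (rating_a, rating_b); score_a is 1.0 for an A win, 0.0 for a loss."""
    expected_a = elo_expected(rating_a, rating_b, divisor)
    delta = k * (score_a - expected_a)
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Same 250-point gap under the two league settings quoted above.
p_intl = elo_expected(1750.0, 1500.0, divisor=150.0)  # roughly 0.98
p_pwhl = elo_expected(1750.0, 1500.0, divisor=400.0)  # roughly 0.81
```

A grid search over (K, divisor) then just replays the training games in date order for each pair and keeps the pair with the lowest log-loss.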
What I'm missing: any historical odds data for these competitions. Specifically:
Sources I've explored that don't work: the most popular sports-data API has zero women's hockey. Flashscore only retains the last live line, which isn't a true closing snapshot. OddsPortal has some pages but the coverage for these specific leagues is patchy.
If anyone has worked with women's hockey odds data, has an archive from a previous project, or knows of a less-mainstream source that covers these competitions, I'd be very grateful for pointers. Happy to share the dataset and methodology back.
I've scraped + modeled the entire bettable universe of women's hockey: PWHL, IIHF Women's Worlds, and Olympic women's tournaments. Some quick numbers on what's built so far:
Dataset (565 games):
Model performance, chronological 80/20 split, no leakage:
| Bucket | Test n | Accuracy | Log-loss | Brier |
|---|---|---|---|---|
| INTL (Worlds + Olympics) | 52 | 86.5% | 0.349 | 0.115 |
| PWHL | 58 | 56.9% | 0.658 | 0.234 |
The international gap (USA/CAN ~250 Elo above everyone else) is huge and the model captures it: calibration is tight in the confident bins (96% predicted → 100% actual, 76% → 75%, 5% → 0%). PWHL is essentially coin-flip-by-design. I added rest, recent form, and goalie save% as features; they have the right sign in the coefficients but can't break the parity ceiling with only 245 training games.
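For reference, the metrics in the table and the calibration bins can be computed along these lines. This is a generic sketch with made-up inputs, not the project's evaluation code; `probs` are predicted win probabilities for one side and `outcomes` are 1 if that side won, 0 otherwise.

```python
import math


def evaluate(probs, outcomes, eps=1e-15):
    """Accuracy, log-loss and Brier score for binary win/loss predictions."""
    n = len(probs)
    accuracy = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes)) / n
    log_loss = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, outcomes)
    ) / n
    brier = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n
    return {"accuracy": accuracy, "log_loss": log_loss, "brier": brier}


def calibration_table(probs, outcomes, n_bins=5):
    """Rows of (mean predicted, actual win rate, count) for each non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        bins[idx].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins
        if b
    ]


# Toy example, not real model output.
toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_outcomes = [1, 1, 0, 0]
metrics = evaluate(toy_probs, toy_outcomes)
table = calibration_table(toy_probs, toy_outcomes)
```

As a yardstick, always predicting 50% gives a log-loss of ln 2 ≈ 0.693 and a Brier score of 0.25, so the PWHL numbers above sit only slightly better than a coin flip.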
I tuned Elo K and the divisor (slope) per league via grid search (INTL K=100, divisor=150; PWHL K=50, divisor=400; 538-style approach). I also have a Poisson goal-totals model that's reasonably calibrated for PWHL (34% predicted ≈ 29% actual on over 5.5) but systematically over-predicts for INTL. That is interesting in itself, because it implies under bets on tournament games near the model's predicted total are likely mispriced in bettors' favor.
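A minimal sketch of how an over/under probability falls out of a Poisson totals layer. It assumes each team's goals are independent Poisson counts (so the total is Poisson with the summed rate), a half-point line so there is no push, and no special handling of overtime or shootouts. The scoring rates below are invented for the example.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def prob_over(line: float, lam_home: float, lam_away: float) -> float:
    """P(total goals > line) for a half-point line such as 5.5."""
    lam_total = lam_home + lam_away  # sum of independent Poissons is Poisson
    max_under = math.floor(line)     # e.g. 5 for a 5.5 line
    p_under = sum(poisson_pmf(k, lam_total) for k in range(max_under + 1))
    return 1.0 - p_under


# Hypothetical expected goals of 2.5 per side, total rate 5.0.
p_over_55 = prob_over(5.5, 2.5, 2.5)  # about 0.384
```

Comparing these probabilities against realized over rates by bin is what shows the kind of systematic over-prediction described for the international games.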
What I don't have: ANY odds data. I subbed to The Odds API and learned the hard way that they have zero women's hockey coverage (NHL/AHL/Liiga/SHL only — no PWHL, no IIHF Worlds, no Olympics).
The ask:
Looking for opening + closing lines on women's hockey games — moneyline, totals, ideally puck line if available. Specifically:
Anything helps:
Flashscore has odds but only saves the last live line, not the actual closing line, which limits its value for CLV analysis. OddsPortal has historical but I'm unsure on their coverage for women's hockey specifically.
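To show why a true closing line matters, here is a rough sketch of one common way to compute CLV from American moneylines. Definitions of CLV vary; this version compares the price taken against the de-vigged closing probability using simple proportional vig removal, and all function names are illustrative.

```python
def implied_prob(american_odds: float) -> float:
    """Raw implied probability (vig included) from American odds."""
    if american_odds < 0:
        return -american_odds / (-american_odds + 100.0)
    return 100.0 / (american_odds + 100.0)


def devig_two_way(odds_a: float, odds_b: float) -> tuple[float, float]:
    """Proportionally rescale both sides so the probabilities sum to 1."""
    pa, pb = implied_prob(odds_a), implied_prob(odds_b)
    total = pa + pb
    return pa / total, pb / total


def clv(bet_odds: float, close_odds_side: float, close_odds_other: float) -> float:
    """Closing line value: fair closing probability relative to the price taken."""
    fair_close, _ = devig_two_way(close_odds_side, close_odds_other)
    return fair_close / implied_prob(bet_odds) - 1.0


# Bet taken at +120, market closes at -110 / -110 (fair 50%).
example_clv = clv(120, -110, -110)  # about +0.10
```

If the "closing" price is really the last in-play line, `fair_close` reflects the game state rather than the pre-game market, which is why the Flashscore numbers are of limited use here.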
Happy to share the model + dataset back once I get this last piece. Specifically can produce:
Thanks in advance to anyone who can point me in the right direction.
I've been working on a women's basketball prediction model — built it for one league by reverse-engineering an existing public model, now trying to apply the same approach to other leagues. WCBA is on my list and I've hit a wall on data.
What I have already:
What I need:
event/{id}/statistics endpoint mostly 404s on these older games (only 12-31% coverage).
What I've tried:
Anyone have a CSV dump, a working scraper, or a tip on a source I'm missing? Even a partial historical export from a previous project would help. Use case is non-commercial.
Happy to share what I've already pulled if useful in trade.
Thanks!