r/algobetting

I built a predictive model for football match stats (shots, corners, fouls) across 20,000 matches. The strongest predictor ended up being ELO from chess. [OC]
▲ 64 r/algobetting+2 crossposts

For the past few months I've been working on a personal project: a predictive model for per-match football statistics. Not the final score, but the behaviors: how many shots each team will take, corners, fouls, cards. The dataset covers around 20,000 matches across five seasons and the top 5 European leagues.

I started with hundreds of variables: rolling shot averages, foul rates, corner frequencies, home/away splits, opponent profiles. Everything you'd expect. The first results were decent, but the model was essentially regressing toward each team's historical mean without any real understanding of match context. It could see that Team A averages 14 shots and Team B averages 11, but it had no concept of the gap between the two sides. It didn't know that tonight Team A is so much stronger they'll pin Team B in their own half for 70 minutes and probably end up with 19 shots while Team B scrapes together 6.

Historical averages are built against opponents of all quality levels. They encode nothing about the specific match being played, and that contextual read is exactly what every football fan processes automatically before kick-off. The hard part is giving a model a number for something so intuitive.

I ended up turning to chess. ELO ratings were invented in the 1960s by Arpad Elo to classify players more precisely than tournament standings alone. Beat someone stronger and your score rises significantly; lose to someone weaker and it drops. It updates after every game, with the only inputs being the result and the relative strength of the two players — no performance quality, no expected goals, just who won and against whom.

I built an ELO system for all clubs across the top 5 leagues, initialized from external sources and updated match by match through five seasons. When I added the ELO gap between the two teams as a predictor, things shifted immediately.
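The per-match update described above can be sketched in a few lines. The K-factor, draw handling, and example numbers here are my assumptions, not the author's exact choices:

```python
def elo_update(r_home, r_away, outcome, k=20.0):
    """outcome: 1.0 home win, 0.5 draw, 0.0 away win.
    Only the result and the relative strength go in: no performance stats."""
    expected_home = 1.0 / (1.0 + 10 ** ((r_away - r_home) / 400.0))
    delta = k * (outcome - expected_home)
    return r_home + delta, r_away - delta

# An upset moves ratings more than an expected result:
h1, a1 = elo_update(1600, 1400, 0.0)  # big underdog wins away
h2, a2 = elo_update(1600, 1400, 1.0)  # favourite wins as expected
```

The update is zero-sum (what one side gains the other loses), which is what keeps the league-wide ratings anchored over time.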

Bivariate Spearman correlation with shots:

Predictor              Correlation
ELO gap                0.377
Rolling shot average   0.273

The chess number outperformed every football-specific variable in the model. And when you break it down by bucket, it's obvious why:

ELO gap                   Avg shots
< −200 (much weaker)      9.2
−200 to −100              10.5
−100 to −50               11.0
±50 (balanced)            12.8
+50 to +100               13.0
+100 to +200              14.4
> +200 (much stronger)    17.4

Global average: 12.7 shots

From 9.2 to 17.4 driven entirely by the strength gap — and no rolling average captures it, because rolling averages don't know who those shots were taken against. A team that faced three weak sides in a row will have inflated numbers; the ELO gap adjusts for that automatically.
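The bucket table is just a group-by on the pregame gap. A minimal sketch, with bucket edges mirroring the table and made-up data:

```python
from collections import defaultdict

def gap_bucket(gap):
    # Bucket edges mirror the table above (gap from the team's perspective).
    edges = [(-200, "< -200"), (-100, "-200 to -100"), (-50, "-100 to -50"),
             (50, "+/-50"), (100, "+50 to +100"), (200, "+100 to +200")]
    for edge, label in edges:
        if gap < edge:
            return label
    return "> +200"

def avg_shots_by_bucket(rows):
    """rows: iterable of (elo_gap, shots) pairs, one per team per match."""
    totals = defaultdict(lambda: [0.0, 0])
    for gap, shots in rows:
        acc = totals[gap_bucket(gap)]
        acc[0] += shots
        acc[1] += 1
    return {bucket: s / n for bucket, (s, n) in totals.items()}
```

Running this over per-team-per-match rows reproduces the kind of monotone staircase shown above.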

200 variables, five seasons of data, five leagues, and the most important feature had nothing to do with football.

Happy to get into the methodology or the initialization choices in the comments.

u/Agalex97 — 1 day ago
▲ 0 r/algobetting+1 crossposts

Would I be considered a sharp?

Attached are my betting results tracked by Pikkit. I started betting on sports seriously, with the intention of making real profit, when I turned 18 in May 2025, and I was wondering whether I'd be considered a sharp, and if not, what I can do to become one.

I know my sample size is relatively small (a bit under 350 bets placed), but I was curious.

For context, I mainly bet on UFC and boxing (like 95% of my bets). In 2025 I was mainly betting through DFS apps like Underdog, PrizePicks, Sleeper, etc., but those apps seem to have become more efficient with their UFC and boxing lines, so in 2026 I mainly just bet moneylines on apps like Kalshi.

Finally, I want to become a quant trader at an HFT firm and was wondering how marketable this would be for that, assuming I scale it up more, and if so, how I should word it.

u/Initial-Web4015 — 3 days ago

"Knowing a sport" absolutely matters for profitable pre-game betting & CLV.

There is a myth floating around that "knowing a sport" is irrelevant to being a profitable bettor. That's true for top-down strategies like arbitrage, but for any kind of originating where you are truly beating the market bottom-up (like getting CLV on liquid main lines at Pinnacle for NFL/soccer/NBA), it absolutely matters. The best bettors who get consistent CLV on main lines in major leagues have a way to price things (models, etc.), but they also rely heavily on discretionary skills: understanding injuries, market dynamics, precedents, and so on. Frankly, a person who just uses a model for something like NBA pre-game has zero chance against a person using a model plus qualitative skills, even if they have the best model in the world. And lastly, when I say "knowledge" I mean more than knowing the sport or trying to predict a winner; it includes market dynamics, patterns, how books work, injuries, intuition, sentiment around injuries, etc.

u/Calm_Set5522 — 4 hours ago

What is the best websocket odds provider?

So I'm looking to get live odds for major leagues from Pinnacle/FD. My question: are there any reliable third-party odds providers that deliver via websocket, so you're at most milliseconds behind the actual Pinnacle/FD odds? I don't mind paying even a few thousand a month, but there is so much garbage out there it's hard to tell which providers are actually legit.

Does anyone know?

u/Calm_Set5522 — 2 days ago

Using Pinnacle Live Markets

Hey, I'm currently working on an EV betting website, with a plan to expand the product to include real-time markets, especially focused on soccer, though I also want to include basketball and tennis.

I have already deployed an application to monitor and gather odds movements, place bets, and simulate profit with Kelly sizing. As a sharp reference book, I use Pinnacle,

but I find many of their lines to be stale, throwing false EV signals, which, with more data, I can obviously filter out later by inspecting the data.

The question is: what do you guys use as a sharp reference platform for live markets? Maybe the Betfair exchange? Is there any other good sharp reference market with good live coverage?

u/IllustriousGrade7691 — 2 days ago

Should I pivot from arbitrage to value bets?

I've been running an automated arbitrage strategy on live sports for the past few weeks and it has been profitable overall (avg ROI 2.8% per bet).

However, I've noticed that a few times the algo has had problems filling both sides of the arb, resulting in bigger losses.

I did some analysis on the past betting data and noticed that I would have made at least 1.5x the profits by only placing bets on the soft book (ROI ~22% from those +EV bets). Though I should mention that the sample size is quite small (31 bets), which could affect the results.

So my question is, should I pivot to doing value bets by using the sharp book as true odds and only placing bets on the soft book? Also, I would appreciate any additional advice since I don't have any previous experience of doing automated value bets.

Thanks in advance.

EDIT: "at least 1.5x the profits" is incorrect, the analysis suggests the strategy would have made closer to 3x the profit.
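For anyone weighing the same pivot, the basic value-bet check (devig the sharp book, compare to the soft price) can be sketched as below. The multiplicative devig and the example numbers are illustrative, not a recommendation:

```python
def implied_prob(decimal_odds):
    return 1.0 / decimal_odds

def devig_two_way(sharp_a, sharp_b):
    # Strip the sharp book's margin proportionally to get "true" probabilities.
    pa, pb = implied_prob(sharp_a), implied_prob(sharp_b)
    total = pa + pb  # > 1 because of the vig
    return pa / total, pb / total

def expected_value(true_p, soft_odds):
    # Expected profit per unit staked at the soft book's price.
    return true_p * (soft_odds - 1.0) - (1.0 - true_p)

# Example: sharp book quotes 1.95/1.95, soft book hangs 2.10 on side A.
p_a, p_b = devig_two_way(1.95, 1.95)
ev = expected_value(p_a, 2.10)  # 0.5 * 1.10 - 0.5 = +5% per unit
```

The multiplicative devig is the simplest option; other margin models (additive, power) exist and change the numbers slightly at longer odds.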

u/Lelleri1331 — 2 days ago

I built a stacking ensemble for football Over/Under markets across 8,200 bets. ELO gap turned out to be the strongest single predictor. [OC]

Been working on this for about a year. Here's what actually moved the needle.

**The model:**

Stacking ensemble — XGBoost + LightGBM + Random Forest as base learners, Logistic Regression as meta-learner. Isotonic calibration on top. Threshold auto-tuned per market on validation set.

**~165 features per match:**

- ELO ratings with K=30 and +100 home advantage modifier
- Form: last 10 home games and away games tracked separately
- xG luck factor (actual goals vs expected goals delta)
- Rest days, H2H records, league position, referee tendency
- League baseline stats per market

**Why ELO ended up on top:**

Same finding as the chess ELO post from earlier today — historical rolling averages don't capture opponent quality. A team that faced three weak sides in a row has inflated shot numbers. ELO adjusts for that automatically.

In our feature importance output, ELO gap ranks #1 across goals markets. Especially dominant for Over 0.5 — mismatched games (ELO gap >200) almost never finish 0-0.

**Backtest methodology:**

Time-based 80/20 split — no data leakage. Trained on seasons up to cutoff, tested on what came after. 12 European leagues, 11 betting markets.

**Results on 8,200 bets:**

| Market            | Hit rate | n     |
|-------------------|----------|-------|
| Over 0.5 goals    | 93.5%    | 1,134 |
| Corners over 12.5 | 78.0%    | 1,134 |
| Over 1.5 goals    | 77.8%    | 1,096 |
| BTTS              | 66.2%    | 337   |
| High-conf overall | 85.9%    | 1,588 |

High-confidence = model probability ≥ 0.70 across all three base learners simultaneously.

**What I learned:**

1. Market selection beats model complexity. Over 0.5 hits 93.5% not because the model is smart — it's because only ~6% of top European matches finish 0-0. The model just identifies those 6%.
2. Stacking beats any single model by 8-12% consistently. The meta-learner learns when to trust XGBoost over LightGBM and vice versa depending on the market.
3. Isotonic calibration is underrated. Raw probabilities from tree models are poorly calibrated. After isotonic calibration the reliability diagram tightened significantly — matters a lot for threshold selection.
4. Correct score and first goalscorer have too much irreducible variance. Dropped them early. Focused on high base-rate markets.

Happy to discuss feature engineering or calibration approach in the comments. Also tracking picks publicly since May 3 if anyone wants to see live results vs backtest baseline.

u/Old-Friendship-8013 — 1 day ago

Sportsbook provider

Yo everyone,

currently building a sportsbook app/platform and I'm looking for good sportsbook API/provider recommendations.
Main things I'm searching for:
- live odds
- pre-match odds
- in-play odds
- all markets possible
- historical odds/data
- scores/results for settlement when a match finishes
- low-delay real-time updates
- plus a ball-tracking API
Mostly football for now, but other sports (all sports) are welcome too.

I'm trying to find something with decent/startup-friendly pricing, because most providers I contacted charge crazy prices.

Already checked stuff like Goalserve, TheOddsAPI etc., but I wanna hear real experiences from people actually running sportsbooks/apps.

Which providers are worth it and stable long-term?

u/Admirable_Piccolo_92 — 4 days ago

Fast, affordable API with live info about a game?

I'm looking for a new data source that can quickly and accurately tell me about what is happening during a game that I can use for some automated projects.

For example, if Aaron Judge hits a home run to give the Yankees a 1-0 lead, I want Aaron Judge's stats, the score of the game, the inning, the number of outs, etc. to be sent to me through the API.

I know Sportradar and Genius are the top 2 options, but they are very expensive and also don't sell to everyone even if you're willing to pay. So I'm looking for options that are more widely available, preferably cheaper, but still relatively fast and accurate. I would obviously expect some tradeoffs if it is being sold cheaper, but just trying to find some options that do what I'm looking for.

u/alexkem — 5 days ago

Real-time Pinnacle odds via WebSocket

I have live Pinnacle odds via WebSocket with no delay across 25+ leagues.

Covers the main sports (soccer, NFL, NBA, NHL, tennis, esports), lower-tier regional leagues, and a bunch more. Pre-match and live.

Let me know if anyone is interested

u/talinator1616 — 2 days ago
▲ 48 r/algobetting+1 crossposts

I'm a data analyst by day. About 18 months ago I got tired of losing on props by going with my gut, so I started treating it like a work problem. Built a Postgres database that ingests box scores via the NBA stats API, PrizePicks lines from a scraper I wrote, and rotation data from a combo of the NBA's hustle stats endpoint and pbp stats. Everything is timestamped and versioned so I can re-run any historical window.

The dataset: 412 regular season games from Nov 2024 through April 2025, plus the same window for the 2023-24 season for validation. Every starter and 6th man. Points, rebounds, assists, 3PM, and steals+blocks. That's roughly 4,800 player-game rows per season.

Here's what held up across both seasons.

Edge 1: High-usage guards on back-to-back unders (PTS and AST)

I defined "high-usage" as >26% usage rate per Cleaning the Glass. Then I filtered for guards playing their 2nd game in 2 nights where they played >30 min the night before.

2023-24 season: 87 qualifying player-games. Under hit on points at 58.6%. Under hit on assists at 61.2%. Average line on points was 22.4, average actual was 19.1. That's a -3.3 delta.

2024-25 season: 91 qualifying player-games. Under on points: 56.0%. Under on assists: 59.3%. Average line 22.8, average actual 20.0. Delta: -2.8.

The edge compressed slightly year over year but stayed significant. For context, a 57% hit rate at -110 sits about 4.6 points above the 52.4% break-even, which flat-staking works out to roughly 8-9% ROI per bet. Over a season with maybe 2-3 of these spots per week, that's ~60 bets. At 1 unit each, you're looking at around +5 units in expectation. Not life-changing, but it's free money if you're disciplined.
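For reference, the break-even and flat-stake ROI arithmetic at -110 can be checked in a few lines:

```python
def breakeven_prob(american):
    """Win probability needed to break even at a given American price."""
    if american < 0:
        return -american / (-american + 100.0)
    return 100.0 / (american + 100.0)

def flat_roi(hit_rate, american):
    """Expected profit per 1-unit flat stake."""
    win = 100.0 / -american if american < 0 else american / 100.0
    return hit_rate * win - (1.0 - hit_rate)

be = breakeven_prob(-110)   # ~0.524
roi = flat_roi(0.57, -110)  # ~0.088, i.e. roughly 8.8% per unit staked
```

Note the distinction: the edge over break-even at 57% is ~4.6 percentage points, while the per-bet flat-stake ROI is higher because a win at -110 pays 0.909 units.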

The mechanism is pretty obvious when you think about it: these guys are running the offense, carrying the ball up, taking the tough shots. On night 2 after 32+ minutes of that, the legs go first. Shot velocity drops. They settle. Assists dry up because they're not driving and kicking as hard. The books shade maybe 0.5 points from the normal line but the real performance hit is 2-3x that.

Specific example: Ja Morant, Dec 14 2024 (2nd night of B2B after 34 min vs IND). Line was 24.5 points. He put up 16 on 6-of-17 shooting with 4 assists (line was 7.5). Under both by a mile. This pattern repeated for Shai, Fox, Maxey, Brunson. The only guys who seemed immune were LeBron (he's a freak) and occasionally Luka (who will literally shoot his way into volume regardless of fatigue, but his efficiency tanks).

Edge 2: Rest-advantage overs for big men (REB only)

This one surprised me. I expected rest advantage to matter more for guards given the running, but the rebounding edge for well-rested bigs was actually cleaner.

Filter: Centers and PFs with >24 min/g, coming off 2+ days rest, facing a team on a B2B. Rebounds line only.

2023-24: 104 qualifying games. Over hit 54.8%. Average line 9.2, average actual 10.1. Delta +0.9.

2024-25: 98 qualifying games. Over hit 56.1%. Average line 9.4, average actual 10.4. Delta +1.0.

Why this works: When the opponent is on a B2B, their guards are slower getting back in transition, their bigs are slower to box out, and there are more live-ball rebounds available in general because shooting percentages drop on B2Bs too. The well-rested big feasts on the chaos. It's not that he's playing better, it's that the environment creates more available rebounds.

I watched this play out in real time with Domantas Sabonis on March 3, 2025. Kings had 2 days rest. Hawks were on a B2B. Sabonis line was 11.5 rebounds. He grabbed 19. Wasn't even close. The Hawks bigs looked like they were moving in sand.

Edge 3: The 0.5 point line move signal

I tracked every prop line from open to close for the 2024-25 season using 15-minute snapshots. When a player prop line moved 0.5 points or more from open to game-time close, the direction of the move correlated with the result at 59.3% across 1,240 qualifying moves.

That number is absurd if you think about what it means. The books are adjusting because sharp money came in, and that sharp money is right almost 60% of the time. If you could just ride the coattails of line moves that size, you'd have a 7% edge at -110 without doing any analysis of your own.

The problem: detecting the move requires checking the line multiple times between open and close. I automated it. If you can't automate it, set a reminder to check PrizePicks and DraftKings at open and then again 90 minutes before tip. If the line moved 0.5+, ride it. If it didn't, pass.

One important caveat: this edge is stronger on totals and spreads than on player props specifically. On player props the sample is smaller and the noise is higher. But the direction holds.
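The ride-the-move rule is trivial to encode once you have the open and close snapshots. Threshold and labels here are illustrative:

```python
def line_move_signal(open_line, close_line, threshold=0.5):
    """Return the side to ride when a prop line moved materially
    from open to close, or None to pass."""
    move = close_line - open_line
    if move >= threshold:
        return "over"   # book raised the number into the money that arrived
    if move <= -threshold:
        return "under"
    return None

line_move_signal(24.5, 25.5)   # "over"
line_move_signal(24.5, 24.75)  # None: move too small, pass
```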

What doesn't work (despite what you've heard):

Home/away splits: I ran a paired t-test on every starter's home vs away performance. Out of 143 qualifying players, 21 had a statistically significant difference (p < 0.05). That's 14.7%. Almost exactly what you'd expect by random chance at a 0.05 threshold. The "home court advantage" for individual player props is largely a myth.

"Trending" overs/unders: A player going over 4 out of 5 games has zero predictive value for game 6. I checked. The over rate for players coming off 4+ overs in their last 5 was 51.2%. That's coin flip territory. Recency bias is the single most expensive cognitive error in prop betting.

I'm happy to share the SQL queries or the schema if anyone wants to replicate this.

u/Fancy-Tadpole-2448 — 8 days ago
▲ 12 r/algobetting+1 crossposts

I’ve been working on a sports Elo variant I call Rolling Reset Elo.

Basic argument: classic Elo is good for some things. Not team sports.

Classic Elo has infinite memory. Every game ever played still contributes to the current rating. That makes sense for chess, where you are tracking one person over a long period of time. It breaks down when you are tracking NBA teams where rosters, coaches, injuries, roles, and usage patterns change constantly.

Most public sports Elo systems solve this with some version of regression to the mean. I think that is mostly BS. You drag every team back toward 1500 on a calendar schedule and call it uncertainty. But uncertainty does not show up once a year on the same day for every team. It shows up after trades, injuries, coaching changes, and teams randomly breaking.

A 'Rolling Reset Elo' fixes it structurally.

For each target date, define a lookback window. Reset every team to the same baseline. Replay only the games inside that window. Store the ratings as the pregame feature for that date. Then move the window forward and do it again.

No seasonal regression hack. No stale franchise history. No hidden computed state.

The bigger payoff is running multiple windows at the same time: elo_30, elo_65, elo_365, etc. The ratios between them become features. If short-term Elo is ripping above long-term Elo, something changed. If it collapses below, something broke.
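A minimal sketch of the windowed replay, assuming a plain win/loss Elo update (the baseline, K-factor, and data are illustrative; draws and margin-of-victory are ignored here):

```python
from collections import defaultdict

def rolling_elo(games, target_date, window_days, base=1500.0, k=20.0):
    """Reset every team to the same baseline, replay only the games inside
    [target_date - window_days, target_date), return pregame ratings."""
    ratings = defaultdict(lambda: base)
    for day, home, away, home_won in games:  # games sorted by day
        if day < target_date - window_days or day >= target_date:
            continue
        exp_home = 1.0 / (1.0 + 10 ** ((ratings[away] - ratings[home]) / 400.0))
        delta = k * ((1.0 if home_won else 0.0) - exp_home)
        ratings[home] += delta
        ratings[away] -= delta
    return dict(ratings)

# Run several windows at once; the ratios between them become features.
games = [(1, "A", "B", True), (2, "A", "C", True), (40, "B", "A", True)]
elo_30 = rolling_elo(games, target_date=41, window_days=30)    # only day 40 counts
elo_365 = rolling_elo(games, target_date=41, window_days=365)  # all three games
```

Replaying from scratch per target date is O(windows × games), which is cheap at NBA scale and removes any hidden computed state, exactly the property argued for above.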

substack link to detailed post

u/__sharpsresearch__ — 12 days ago
▲ 1 r/algobetting+1 crossposts

Pushed a redesign + new evaluation views to my free WTA prediction site

Quick update for anyone tracking this. Just shipped a layout overhaul and a few new model-evaluation views on datadrivenpicks.club.

The model itself: point-level Markov chain (p_in / p1w_serve / p2w_serve per player per match) with surface and recent-form adjustments. v3 launched a couple weeks ago and stacks an XGB winner-blend on top + separate XGB regressors for game margin and total games, trained on ~19k WTA + Challenger matches.
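For anyone curious, the point-level building block of such a Markov model (the probability the server holds a game, given a per-point serve-win probability) has a standard closed form. This sketch omits the site's surface and form adjustments:

```python
def p_hold(p):
    """Probability the server holds a game, given per-point serve-win prob p.
    Enumerate wins to love/15/30, then the deuce loop in closed form."""
    q = 1.0 - p
    deuce_win = p * p / (p * p + q * q)         # win from deuce
    return (p**4 * (1 + 4 * q + 10 * q * q)     # win to love, 15, 30
            + 20 * p**3 * q**3 * deuce_win)     # reach deuce, then win it

# A modest serve edge compounds heavily at the game level:
p_hold(0.60)  # ~0.74
p_hold(0.65)  # ~0.83
```

Game probabilities compose the same way into set and match probabilities, which is where the p1w_serve/p2w_serve estimates ultimately feed.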

What's new on the site:

  • Calibration scatter — predicted probability vs actual win rate per confidence bucket, with a 45° reference line and color coding by deviation from it
  • Surface breakdown — ML accuracy + MOV / Total error standard deviation across hard, clay, and grass
  • Brier and Log Loss as headline quality metrics next to the existing accuracy KPIs
  • Bracket simulator over the active draws — Monte Carlo over the full bracket, championship probability per player, updates round by round
  • Bidirectional game-margin chart per match — cumulative line on top pane, exact-margin PMF bars on the bottom, single SVG

Current track record (ML market, match-level dedupe to favored side):

  • Accuracy: 60.4% across 632 settled matches
  • Brier: 0.232 (random binary = 0.25)
  • Log Loss: 0.655 (random = 0.693)
  • Surface σ: Hard 4.61 / Clay 4.78 / overall 4.75 — pretty consistent across surfaces
  • Calibration mostly clean, but the 60-70% confidence bucket runs a few percentage points hot (mid-bucket overconfidence — leftover from v2's anti-calibration that v3 only partially solved)

Free, no paywall, no monetization. Performance page has everything public.

Site: https://datadrivenpicks.club
Daily edges on Twitter: https://twitter.com/Gay4WTA

Open to feedback — anything that looks miscalibrated or over-engineered, I want to hear it.

u/Jookster1 — 2 days ago

Early Payout Offer. Is anyone else exploiting the 2UP?

I'm Brazilian and I've been testing a method based on exploiting the early payout bonus offered by bookmakers, the well-known 2UP promotion.

The logic is simple: I pick matches with a high probability of goals and unpredictable teams, then bet on home win, draw, and away win simultaneously. Even if that means absorbing some commission, my losses are capped at a max of $30 per game, which lets me stake between $500 and $1,000 per match. When a double payout hits, I collect the full amount on both sides.

I'm currently on day 8 of testing, with more than 30 bets logged and no account restrictions so far. The accumulated profit is considerable, but the volatility is real; frequent small losses pile up until a double win lands and wipes them out with room to spare.

Does this method stay effective long-term? Is anyone else using something similar?

Metric                         Value
Total Bets                     39
Wins                           4
Losses                         35
Win Rate                       10.3%
Loss Rate                      89.7%
Total Invested (losses only)   $601
Total Won                      $2,315
Total Lost                     $601
Net Profit                     +$1,714
ROI                            +$1,714 / $601 ≈ +285%
u/Upstairs_Sandwich_33 — 5 days ago

New to the sports betting bot world but learning quickly. I see posts every day with a 70% WR at 44 trades and 50% ROI after one month, but those accounts typically disappear and never post updates.

Aside from a spike in luck, what's a realistic long-term WR and ROI baseline for a moneyline sports bot in the main sports (NHL, tennis, golf, NBA, MLB, soccer)?

u/Calm-Landscape9640 — 10 days ago
▲ 1 r/algobetting+1 crossposts

I've been building NBA prop models for years now, and the biggest mistake I see people make isn't bad data. It's treating all data the same. So here are four things you can actually go implement right now that moved my numbers significantly.

First: stop using season averages. They're polluted. What you want is a weighted recency window that decays older games. Think of it like this: a game from 6 weeks ago should not count the same as last night. Use an exponential decay over the last 15 to 25 games and your signal gets way cleaner immediately. It's not hard to code. Just weight each game by how recent it is and let the older stuff fade.
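The decayed recency window can be sketched in a few lines (the half-life and the numbers are illustrative):

```python
def decayed_average(games, half_life=8.0):
    """Exponentially weighted average of per-game values, oldest first.
    A game half_life games back counts half as much as the latest one."""
    n = len(games)
    weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    return sum(w * v for w, v in zip(weights, games)) / sum(weights)

# A player trending up: the decayed average sits above the season-style mean.
recent = [18, 20, 19, 22, 25, 24, 27, 26]
plain = sum(recent) / len(recent)   # 22.625
weighted = decayed_average(recent)  # tilted toward the recent games
```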

Second: opponent defense needs to be position-specific, not team-wide. Team defensive rating is an aggregate stat and it's useless for props. A team can be bottom 10 against guards but top 5 against bigs, and if you're taking a point guard under against them because their team DRTG looks good, you are using the wrong number. Isolate by position. It takes more setup, but it's probably the single biggest edge I added.

Third: track minutes trends separately from usage trends. Most people merge these, but they tell you different things. Minutes tell you opportunity. Usage tells you what the player does with that opportunity. A guy can see a minutes bump from a rotation change, but if his usage rate stays flat he is just standing in the corner longer. He is not getting more shots. These are two different signals and you need both.

Fourth: add a variance filter. If a player had two crazy outlier games in the last 10, your model probably thinks he's trending up. But if you strip those two outliers, his baseline hasn't moved at all. Run a quick check on whether the recent trend is being carried by outliers or whether it's structural. If it's outliers only, skip the pick. You don't need it.
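One cheap proxy for the outlier check is comparing the mean against the median: a couple of blow-up games drag the mean but barely move the median. This is a swap for the "strip the outliers" version described above, and the threshold and data are illustrative:

```python
import statistics

def outlier_driven(last_games, tol=0.10):
    """Flag a recent 'trend' as outlier-driven when the mean sits
    well above the median of the same window."""
    mu = statistics.mean(last_games)
    med = statistics.median(last_games)
    return (mu - med) > tol * med

spiky = [12, 11, 13, 12, 38, 11, 12, 40, 13, 12]   # two blow-up games
steady = [12, 14, 15, 16, 15, 17, 18, 17, 19, 18]  # genuine climb
outlier_driven(spiky)   # True: skip the pick
outlier_driven(steady)  # False: trend looks structural
```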

None of this is rocket science. It's just that most people stop building their model once it outputs a number and never add the filters that actually separate signal from noise. These four things alone will clean up your board a lot. The rest of what I run is deeper, but this is the stuff anyone can build in a weekend.

u/gideon_foxie — 8 days ago

How hard is it to bet opening lines for main markets on major sports

I’m talking moneylines, spreads, totals for sports like MLB, NFL, CFB, CBB, NBA.

How much time after the lines are released will I have to bet them at opening price?

Is there a good place to monitor to see when opening odds have dropped, or is that something I gotta create myself?

Any help is appreciated, thank you!

u/Alarmed-Error529 — 6 days ago

I built riftcast.gg , a completely transparent ML prediction system for League of Legends Esports - feedback appreciated

Hey everyone. I built https://riftcast.gg/, an ML prediction system for LoL Esports with both training stats visible and historical data tracked (whether model predictions were correct or not).

The setup:

- 3,091 pro matches in the dataset across 272 teams and 43 tournaments (so far), covering all major regions (LCK, LPL, LEC, LCS) and minor regions

- Series-level predictions (pre-match) and game-level predictions (post-draft)

- Three models running in parallel:
  - FastTree (free tier baseline, simplest features)
  - LightGBM with patch/meta-aware features (tracks game duration trends, team performance gaps between recent patches and all-time, format interactions like is_bo5 * elo_diff, etc.)
  - PCA Sweep — runs a 7,000-config hyperparameter search for ~5 hours weekly, PCA-compresses the noisy draft features
- Plus a Consensus prediction combining all three

**Feature engineering:**

The series model uses ~80 features after filtering. Heavy use of:

- Differential features (Blue stat - Red stat) to avoid teaching the model side bias

- Decayed all-time stats + Diff5 rolling windows for recent form

- A custom Elo system with cross-league calibration (this is what handles international events, which only have ~20 games of historical data)

- Hand-crafted composite features (Diff_Composite_EarlyGame, _Combat, _Vision, etc.) to compress correlated signals.

The draft model adds champion-level features: per-lane Overall/Counter/Mastery/Meta scores weighted by Samples confidence, synergy by lane-pair (Top-Jgl, Mid-Jgl, Bot-Sup), and per-lane "LaneEdge" composites.
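The differential-feature idea can be sketched in a couple of lines (names and values here are illustrative, not the site's actual feature set):

```python
def diff_features(blue, red):
    """Differential (Blue - Red) features: the model sees the gap between
    the teams rather than which side the stronger team happened to draw,
    so it can't learn a spurious side bias."""
    return {f"Diff_{k}": blue[k] - red[k] for k in blue.keys() & red.keys()}

diff_features({"gold_at_15": 24500, "elo": 1620},
              {"gold_at_15": 23800, "elo": 1540})
# yields Diff_gold_at_15: 700 and Diff_elo: 80
```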

I have an "Uncertain" tag which excludes results for predictions with less than 55% certainty; it is also shown in the UI for transparency.

Accuracy across the last 2 weekly reports published (75 series, 209 games):

https://preview.redd.it/6c3hogqx6a0h1.png?width=1539&format=png&auto=webp&s=ae7009a43f6e84f41b732dbe2d79a75ceb029da3

I also track each model's performance per league and show it on each upcoming match prediction. For example, Consensus (the aggregate from all models) is yet to make a wrong Series prediction for LCS (17/17 correct) and has a fairly good accuracy for Game (Draft) Predictions as well (28/35 correct)

https://preview.redd.it/wv57dtz08a0h1.png?width=930&format=png&auto=webp&s=e0beebbce44f895459e73fd5c9b2d21ae80fabd3

Where I think it's weak:

- International events (~20 cross-region games in dataset) — Elo helps but cross-region calibration is shaky

- LightGBM volatility week-over-week (76%/70% vs 69%/80%) — patch-aware features may be over-correcting

Any feedback will be much appreciated, thanks!

u/EntertainmentCalm889 — 3 days ago
▲ 2 r/algobetting+1 crossposts

Anyone know where I can get low-latency live tennis stats? Just looking for something simple, like play-by-play points. I've looked around, but all I've found is Sportradar, which is really expensive.

u/No-Original-5312 — 7 days ago