
I built riftcast.gg, a fully transparent ML prediction system for League of Legends Esports - feedback appreciated
Hey everyone. I built https://riftcast.gg/, an ML prediction system for LoL Esports. Training stats are fully visible, and every prediction is tracked historically so you can see whether each one turned out correct.
The setup:
- 3,091 pro matches in the dataset across 272 teams and 43 tournaments (so far), covering all major regions (LCK, LPL, LEC, LCS) and minor regions
- Series-level predictions (pre-match) and game-level predictions (post-draft)
- Three models running in parallel:
  - FastTree (free-tier baseline, simplest features)
  - LightGBM with patch/meta-aware features (tracks game-duration trends, team performance gaps between recent patches and all-time, format interactions like is_bo5 * elo_diff, etc.)
  - PCA Sweep: runs a ~7,000-config hyperparameter search for ~5 hours weekly and PCA-compresses the noisy draft features
- Plus a Consensus prediction combining all three
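The post doesn't say how the Consensus prediction is formed, so here's a minimal sketch assuming a plain average of the three models' blue-side win probabilities; the function name and equal weighting are my own assumptions:

```python
# Hypothetical consensus: average the blue-win probabilities from the
# three models (FastTree, LightGBM, PCA Sweep). The real riftcast.gg
# combination rule is not specified in the post.

def consensus(probs: list[float]) -> float:
    """Combine per-model blue-side win probabilities by simple averaging."""
    return sum(probs) / len(probs)

# Example: three models lean blue, so the consensus does too.
p = consensus([0.62, 0.58, 0.66])
print(round(p, 3))  # 0.62
```

A weighted average (e.g. by each model's recent per-league accuracy) would be a natural refinement, since the post tracks exactly that stat.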
**Feature engineering:**
The series model uses ~80 features after filtering. Heavy use of:
- Differential features (Blue stat - Red stat) to avoid teaching the model side bias
- Decayed all-time stats + Diff5 rolling windows for recent form
- A custom Elo system with cross-league calibration (this is what handles international events, which only have ~20 games of historical data)
- Hand-crafted composite features (Diff_Composite_EarlyGame, _Combat, _Vision, etc.) to compress correlated signals.
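For readers unfamiliar with the Elo backbone such a system builds on: the cross-league calibration layer isn't described in the post, so this sketch shows only the standard expected-score/update step, with an assumed K-factor of 32:

```python
# Standard Elo update (the base that a cross-league calibration layer
# would sit on top of). K=32 is an assumption, not riftcast.gg's value.

def expected_score(r_blue: float, r_red: float) -> float:
    """Probability that the blue-side team wins, given both ratings."""
    return 1.0 / (1.0 + 10 ** ((r_red - r_blue) / 400))

def update(r_blue: float, r_red: float, blue_won: bool, k: float = 32.0):
    """Return the post-game ratings for both teams."""
    e = expected_score(r_blue, r_red)
    s = 1.0 if blue_won else 0.0
    delta = k * (s - e)
    return r_blue + delta, r_red - delta

blue, red = update(1500, 1500, blue_won=True)
print(blue, red)  # 1516.0 1484.0
```

Cross-league calibration typically means anchoring each league's rating pool so that, say, LCK 1600 and LCS 1600 imply comparable strength, which matters most at international events with few cross-region games.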
The draft model adds champion-level features: per-lane Overall/Counter/Mastery/Meta scores weighted by a Samples-based confidence, synergy by lane pair (Top-Jgl, Mid-Jgl, Bot-Sup), and per-lane "LaneEdge" composites.
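A hypothetical sketch of what a per-lane LaneEdge composite could look like; the actual weights and the sample-confidence formula aren't given in the post, so every number here is an assumption:

```python
# Hypothetical LaneEdge composite: blend the four per-lane scores and
# shrink the result toward zero when the champion has few samples.
# Weights (0.4/0.2/0.2/0.2) and min_samples=30 are illustrative only.

def lane_edge(overall: float, counter: float, mastery: float,
              meta: float, samples: int, min_samples: int = 30) -> float:
    confidence = min(samples / min_samples, 1.0)  # low-sample picks count less
    raw = 0.4 * overall + 0.2 * counter + 0.2 * mastery + 0.2 * meta
    return raw * confidence

# A well-sampled pick keeps its full composite score...
print(round(lane_edge(0.6, 0.5, 0.7, 0.55, samples=45), 3))  # 0.59
# ...while a rarely-played pick is shrunk toward neutral.
print(round(lane_edge(0.6, 0.5, 0.7, 0.55, samples=15), 3))
```

The shrinkage step is one way to encode the "weighted by Samples confidence" idea: a 70% win rate over 5 games shouldn't carry the same signal as over 50.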
There's also an "Uncertain" tag that excludes predictions with less than 55% certainty from the accuracy results; the tag is shown in the UI for transparency.
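The Uncertain tag could be implemented along these lines; the 55% threshold is from the post, but defining certainty as the larger of the two class probabilities is my assumption:

```python
# Flag low-certainty predictions so they can be excluded from accuracy
# stats. Certainty here is the probability of the favored side, i.e.
# max(p, 1 - p) for a blue-win probability p (assumed definition).

def tag_uncertain(p_blue_win: float, threshold: float = 0.55) -> bool:
    """Return True if the prediction should carry the Uncertain tag."""
    certainty = max(p_blue_win, 1.0 - p_blue_win)
    return certainty < threshold

print(tag_uncertain(0.52))  # True  (52% certainty -> excluded from stats)
print(tag_uncertain(0.30))  # False (70% certain red wins -> counted)
```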
Accuracy across the last two published weekly reports (75 series, 209 games):
I also track each model's per-league performance and show it on every upcoming match prediction. For example, Consensus (the aggregate of all three models) has yet to make a wrong series prediction for LCS (17/17 correct) and holds fairly good accuracy on game (draft) predictions as well (28/35 correct).
Where I think it's weak:
- International events (~20 cross-region games in the dataset): Elo helps, but cross-region calibration is shaky
- LightGBM volatility week-over-week (76%/70% vs. 69%/80%): the patch-aware features may be over-correcting
Any feedback will be much appreciated, thanks!