Live automated Betfair value betting system — looking for serious collaborators
I've built a full-stack profitable automated horse racing value betting system - looking for serious collaborators
Long post but I want to give enough detail to attract people working at this level rather than just getting generic replies.
---
**What I've built**
Started this as a uni project (Mechanical Engineering, so I'm self-taught on the ML and Python side) and it's grown into something I want to take seriously. The system runs end-to-end:
- **Data:** Mix of publicly available form data, race conditions (going, distance, class, field size), market data (BSP, pre-race odds), and jockey/trainer stats. Currently the ingestion side is semi-automated - I do manual downloads that feed into a fully automated processing and feature engineering pipeline.
- **Model:** Random Forest classifier to identify winners.
- **Staking:** Kelly criterion implementation to size bets based on perceived edge. A logistic regression fitted on all previous bets estimates the win probability of each potential bet, and stakes are sized accordingly. I also place flat stakes at BSP.
- **Execution:** Full Betfair API integration - once a race qualifies, bets are placed automatically. The full cycle from form data to live bet placement runs in under 5 minutes. I also have a Twitter account that tweets my bets in the morning directly from Python to prove legitimacy.
Getting the Betfair API genuinely functional and reliable was a significant challenge, and I consider a clean, fast end-to-end integration one of the main things I have to offer a collaboration.
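For anyone less familiar with the staking bullet above, here is a minimal sketch of Kelly sizing for a single back bet at decimal odds. The function name, the `commission` parameter and the `multiplier` (for fractional Kelly) are illustrative assumptions, not the live implementation.

```python
def kelly_fraction(p_win, decimal_odds, commission=0.0, multiplier=1.0):
    """Fraction of bankroll to stake on a back bet; 0.0 when there is no edge.

    p_win        -- estimated win probability (0..1)
    decimal_odds -- decimal odds including the stake, e.g. 5.0
    commission   -- assumed share of winnings paid as commission (0..1)
    multiplier   -- 1.0 for full Kelly, 0.5 for half Kelly, and so on
    """
    if not 0.0 <= p_win <= 1.0:
        raise ValueError("p_win must be between 0 and 1")
    if decimal_odds <= 1.0:
        raise ValueError("decimal_odds must be greater than 1")
    if not 0.0 <= commission < 1.0:
        raise ValueError("commission must be in [0, 1)")

    # Net profit per unit staked if the bet wins, after commission.
    b = (decimal_odds - 1.0) * (1.0 - commission)
    # Classic Kelly: f = (b*p - q) / b, with q = 1 - p.
    full_kelly = (p_win * b - (1.0 - p_win)) / b
    # Never stake on a negative edge.
    return max(0.0, full_kelly * multiplier)


if __name__ == "__main__":
    # 25% estimated chance at odds of 5.0, full Kelly and half Kelly.
    print(kelly_fraction(0.25, 5.0))
    print(kelly_fraction(0.25, 5.0, multiplier=0.5))
```

Because the stake scales directly with the estimated edge, any error in `p_win` feeds straight into bet size, which is why many people run a fraction of full Kelly.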
---
**Where I'm at and what I want to improve**
The system is live and running on UK racing. I'm still in the performance evaluation phase - I'm not here to claim I've cracked it, but the infrastructure is solid and working. At BSP flat stakes, my tweeted bets returned a 30% ROI, and I took £500 up to a monthly betting revenue of £9,900.
The two areas I know are weakest, plus one I want to expand into:
**Probability calibration** - my RF outputs raw probabilities that I use for value identification, but I haven't built a proper calibration layer. I suspect this is costing me on edge calculation. I've already got a neural net in development.
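As a rough illustration of what checking calibration involves, the sketch below bins predicted probabilities and compares each bin's average prediction with the observed win rate, alongside a Brier score. It is plain Python with hypothetical function names, and it only diagnoses miscalibration; it is not a calibration layer in itself.

```python
def reliability_table(probs, outcomes, n_bins=10):
    """Return (count, mean predicted prob, observed win rate) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue
        n = len(members)
        mean_p = sum(p for p, _ in members) / n
        win_rate = sum(y for _, y in members) / n
        rows.append((n, mean_p, win_rate))
    return rows


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


if __name__ == "__main__":
    # Toy data: model probabilities and whether each horse won (1) or lost (0).
    probs = [0.1, 0.1, 0.6, 0.6]
    outcomes = [0, 0, 1, 0]
    for n, mean_p, win_rate in reliability_table(probs, outcomes, n_bins=2):
        print(n, round(mean_p, 3), round(win_rate, 3))
    print(round(brier_score(probs, outcomes), 4))
```

If the mean prediction in a bin sits consistently above or below the observed win rate, the edge estimate for bets in that range is biased in the same direction.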
**Feature engineering** - I'm reasonably confident in the features I have, but I don't have strong conviction that I'm capturing the right signals, especially around market dynamics and identifying the correct races in the morning (as there are later non-runners).
**New edges** - I want to develop other strategies, like identifying horses whose prices are likely to move in, backing them in the morning and laying them before the off to lock in a return regardless of the result.
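The arithmetic behind that back-then-lay trade can be sketched as follows. This assumes decimal odds and ignores commission; the function name is hypothetical.

```python
def green_up(back_stake, back_odds, lay_odds):
    """Lay stake that equalises profit across outcomes, and that profit.

    Setting the win and lose outcomes equal gives:
        lay_stake = back_stake * back_odds / lay_odds
        profit    = lay_stake - back_stake
    Commission is ignored in this sketch.
    """
    if back_odds <= 1.0 or lay_odds <= 1.0:
        raise ValueError("odds must be greater than 1")
    lay_stake = back_stake * back_odds / lay_odds
    profit = lay_stake - back_stake
    return lay_stake, profit


if __name__ == "__main__":
    # Back 10 at 6.0 in the morning, lay at 4.0 before the off.
    lay_stake, profit = green_up(10.0, 6.0, 4.0)
    # Horse wins:  10 * (6.0 - 1) - lay_stake * (4.0 - 1)
    # Horse loses: lay_stake - 10
    print(lay_stake, profit)
```

The locked-in figure is only positive if the price actually shortens; if the horse drifts instead, the same formula gives an equal loss on both outcomes.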
---
**What I'm looking for**
I want to talk to people working on the same or adjacent problems who are actually in the game - live execution, real models. Not interested in theoretical discussions or people earlier in the journey.
Specifically interested in:
- **Commercial potential** - open to exploring what a more serious joint project could look like.
- **Adjacent sports** - In horse racing I will share a guarded amount of my tech. If you are working on an adjacent sport, I would share almost all of it, provided there is mutual benefit.
Drop a comment or DM if any of this resonates. Happy to go deep on a call into the technical details with the right people.