
I trained an AI to speedrun Super Mario Bros using reinforcement learning, after more than 6 million deaths 😅
The agent starts completely clueless:
- running into the first Goomba
- falling into pits
- getting stuck against pipes
Over time, it slowly learns:
- movement timing
- enemy avoidance
- jump precision
- speed optimization
What’s interesting is that some “speedrunner-like” behaviors emerged naturally during training:
- maintaining momentum
- minimizing hesitation
- optimizing jump timing
I trained it with a custom RL setup that uses frame stacking and temporal modeling. Watching the progression from random movement to competent gameplay was honestly one of the coolest parts of the project.
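For anyone unfamiliar with frame stacking, the general idea looks roughly like the sketch below. This is a simplified illustration and not the exact training code: the `FrameStack` name, the stack size of 4, and the string stand-ins for frames are all assumptions made for the example. The point is that a single frame can't show which way Mario is moving, so the agent is fed the last few frames together.

```python
from collections import deque


class FrameStack:
    """Keep the last `k` frames so the policy can infer motion from still images.

    Hypothetical helper for illustration; real setups usually store image arrays.
    """

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # At episode start there is no history, so repeat the first frame k times.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Oldest frame first, newest frame last.
        return tuple(self.frames)


# Strings stand in for real game frames to keep the example small.
stack = FrameStack(k=4)
obs = stack.reset("f0")
obs = stack.step("f1")
obs = stack.step("f2")
```

After the two steps above, `obs` holds the four most recent frames in order, with the repeated first frame still filling the oldest slots.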
I’d love feedback from people who are into:
- RL
- game AI
- imitation learning
- emergent behavior