Why the fuck does this work???
For context, I am a data science student, so very new to the quant space, but I have been following it and learning about it since I was 14. My parents have always been anti-stocks, in a way, saying that it's a waste of time.
When I was 15 I designed a kind-of quant strategy, but I didn't know how to backtest, tried implementing ML way too early and was using yfinance for data... After a lot of frustration, distraction and realising I had leapt in too early, I saved it to a thumbdrive and left it in my drawer. Now, two years later, I have built a couple of other projects (more focused on intrinsic-value trading or news trading), some of which produced slightly higher-than-average Jensen's alpha, the main metric I focused on after I saw that a lot of my models came back with a high beta.
After making a reasonably successful mid-term model with 33% CAGR over 15 years, I remembered my thumbdrive. I opened the file (magically not corrupted after 24 months of zero care) and laughed at the mockery of code I had produced. I rewrote the code with the same principle... Instead of learning any kind of analysis, my 15-year-old self decided the best thing to do was categorise the previous 21d, 7d and 1d of returns into a bucket A, B, C, D or E, then get the returns of the next 1d, 7d and 21d and do the same. Do this over a big enough period (I did 7 years, as I wanted to capture the covid regime but didn't want it to take too long, since I thought this whole thing would be a waste of time), and that's it. All you have to do then is analyse a list of stocks now, capture each one's 3-letter code and probabilistically determine its future 3-letter code.
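A minimal sketch of the bucketing idea described above, using only the standard library. The bucket edges, the choice to use the same edges for every horizon, and all function names are assumptions for illustration; the post doesn't say how A to E are actually defined, and horizons are treated here as offsets in a list of daily closes.

```python
from bisect import bisect_right
from collections import Counter, defaultdict

# Hypothetical bucket edges (fractional returns). Assumption: the real
# strategy may use quantiles or different edges per horizon.
EDGES = (-0.05, -0.01, 0.01, 0.05)
LETTERS = "ABCDE"
HORIZONS = (21, 7, 1)  # look-back lengths, in the order the post lists them


def bucket(ret):
    """Map a return to a letter: A is the worst bucket, E the best."""
    return LETTERS[bisect_right(EDGES, ret)]


def past_code(prices, t):
    """3-letter code from the trailing 21d, 7d and 1d returns ending at index t."""
    return "".join(bucket(prices[t] / prices[t - h] - 1) for h in HORIZONS)


def future_code(prices, t):
    """3-letter code from the forward 1d, 7d and 21d returns starting at index t."""
    return "".join(bucket(prices[t + h] / prices[t] - 1) for h in reversed(HORIZONS))


def fit_transitions(prices):
    """Count how often each past code was followed by each future code."""
    counts = defaultdict(Counter)
    longest = max(HORIZONS)
    # Only use indices with a full look-back and a full look-ahead window.
    for t in range(longest, len(prices) - longest):
        counts[past_code(prices, t)][future_code(prices, t)] += 1
    return counts


def predict(counts, code):
    """Empirical P(future code | past code); empty dict if the code was never seen."""
    seen = counts.get(code)
    if not seen:
        return {}
    total = sum(seen.values())
    return {fut: n / total for fut, n in seen.items()}
```

To use it, you would call `fit_transitions` on each stock's history up to the training cutoff, then `predict(counts, past_code(prices, len(prices) - 1))` on the latest bar. With five letters per slot there are 125 possible past codes and 125 future codes, so many cells will have very few samples; that is worth checking before trusting any single probability.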
I obviously added to this with EV (expected value), which helped me set a threshold to remove noise, but for whatever reason this strategy is up 40% YTD. What the fuck. To be clear, the data it was given only went up to December 31, 2025. It's done better than any other model I've made and it's genuinely so stupid. It might be a regime thing, but I genuinely don't know, and the number of times it's predicted INTC, where it's 50/50 between a 2-5% loss and a 10-30% gain, is actually insane.
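A rough sketch of what an EV filter like this could look like. Everything here is an assumption: the representative return per bucket, the `min_ev` cutoff, the function names and the placeholder tickers are made up for illustration and are not the actual numbers or logic behind the strategy.

```python
# Hypothetical representative return for each bucket (for example, the mean
# historical return of observations that landed in it). Made-up numbers.
BUCKET_RETURN = {"A": -0.08, "B": -0.03, "C": 0.0, "D": 0.03, "E": 0.08}


def expected_value(probs):
    """EV of a forecast given as {bucket letter: probability}."""
    return sum(p * BUCKET_RETURN[letter] for letter, p in probs.items())


def select(forecasts, min_ev=0.01):
    """Keep tickers whose EV clears min_ev, ranked best first."""
    scored = {ticker: expected_value(p) for ticker, p in forecasts.items()}
    keep = [ticker for ticker, ev in scored.items() if ev >= min_ev]
    return sorted(keep, key=scored.get, reverse=True)


# Placeholder tickers and probabilities, not real forecasts.
forecasts = {
    "TICK1": {"B": 0.5, "E": 0.5},  # coin flip between a small loss and a big gain
    "TICK2": {"C": 1.0},            # flat
    "TICK3": {"A": 0.5, "D": 0.5},  # coin flip between a big loss and a small gain
}
picks = select(forecasts)
```

This is also why a 50/50 hit rate can still be worth trading. Taking the midpoints of the INTC-style split above (a 2-5% loss or a 10-30% gain), the EV is roughly 0.5 × (-3.5%) + 0.5 × 20% ≈ +8.25% per signal, assuming the 50/50 estimate holds out of sample.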
Any ideas as to why this works? In 2 months I can start trading (in Australia you must be 18), so do I trade this strategy, or do I stick with one of my worse-performing ones that has a defensible thesis? At the end of the day, I want to be going up to people in much higher tax brackets and showing them my strategies, and I'd love to show someone something like this, but it's hard to justify an "idk it just works" to someone for a 6- or 7-figure investment.
No, there is no look-ahead bias. It may suffer slightly from survivorship bias, but I think the effects are minimal.