Need reliable source for 30+ years of S&P 500 historical data for LSTM/Transformer research [P]
Hi everyone,
I'm starting a research project on financial time-series forecasting using LSTM and Transformer models for predicting S&P 500 market direction.
Right now, I'm struggling with obtaining reliable long-term historical data.
I tried Yahoo Finance, but downloads either fail or come back inconsistent for me, and most Kaggle datasets I found only cover about 5–10 years.
I specifically need:
- Around 30 years of historical S&P 500 data
- Preferably daily OHLCV data
- A reliable, clean source suitable for ML research
- Ideally free or student-friendly
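To make "reliable and clean" concrete, here is a minimal sketch of the sanity check I'd run on any candidate dataset. It assumes a CSV with `Date` (ISO format), `Open`, `High`, `Low`, `Close`, and `Volume` columns; those column names, the `check_ohlcv` helper, and the 7-day gap threshold are my own assumptions, not any provider's format. The sample rows are made up purely to exercise the checks.

```python
import csv
import io
from datetime import date

# Assumed column names; real sources may name or order these differently.
REQUIRED = ["Date", "Open", "High", "Low", "Close", "Volume"]


def check_ohlcv(rows, min_years=30, max_gap_days=7):
    """Return a list of human-readable problems found in daily OHLCV rows.

    `rows` is an iterable of dicts, e.g. from csv.DictReader.
    """
    rows = list(rows)
    if not rows:
        return ["no rows"]
    missing = [c for c in REQUIRED if c not in rows[0]]
    if missing:
        return [f"missing columns: {missing}"]

    problems = []
    dates = [date.fromisoformat(r["Date"]) for r in rows]
    ordered = sorted(dates)
    if dates != ordered:
        problems.append("dates are not sorted ascending")
    if len(set(dates)) != len(dates):
        problems.append("duplicate dates")

    # Coverage: calendar span between first and last observation.
    span_years = (ordered[-1] - ordered[0]).days / 365.25
    if span_years < min_years:
        problems.append(f"only {span_years:.1f} years of coverage")

    # Each bar should satisfy Low <= min(Open, Close) and max(Open, Close) <= High.
    for r in rows:
        o, h, l, c = (float(r[k]) for k in ("Open", "High", "Low", "Close"))
        if not (l <= min(o, c) and max(o, c) <= h):
            problems.append(f"inconsistent OHLC on {r['Date']}")

    # Flag calendar gaps longer than the threshold (weekends and holidays are shorter).
    for prev, cur in zip(ordered, ordered[1:]):
        if (cur - prev).days > max_gap_days:
            problems.append(f"gap of {(cur - prev).days} days before {cur}")

    return problems


# Made-up sample: the second row has Close below Low on purpose.
SAMPLE = """Date,Open,High,Low,Close,Volume
2024-01-02,100.0,101.0,99.0,100.5,1000
2024-01-03,100.5,102.0,100.0,99.0,1200
"""

if __name__ == "__main__":
    for problem in check_ohlcv(csv.DictReader(io.StringIO(SAMPLE))):
        print(problem)
```

If a source passes something like this over the full 30-year range, I'd consider it good enough to start with.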
I also want to understand what researchers typically use in academic work for financial forecasting:
- Yahoo Finance?
- Alpha Vantage?
- WRDS/CRSP?
- Polygon?
- Kaggle?
- Something else?
Additionally:
- Is using only S&P 500 index data enough for a Master's-level research project?
- Or should I include technical indicators, macroeconomic data, sentiment, or constituent stock data?
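For context on the index-only option, this is a rough sketch of the kind of setup I have in mind: a binary next-day direction label plus a couple of features derived from closing prices alone. The label definition (next close strictly above today's close), the function names, and the choice of features are my assumptions for illustration, not something taken from a specific paper. The moving average is trailing only, so each value uses no future prices.

```python
import math


def direction_labels(closes):
    """1 if the next day's close is above today's, else 0.

    The last day has no next close, so the result is one element shorter.
    """
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


def log_returns(closes):
    """Daily log returns; one element shorter than the input."""
    return [math.log(nxt / cur) for cur, nxt in zip(closes, closes[1:])]


def simple_moving_average(values, window):
    """Trailing simple moving average; None until `window` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


if __name__ == "__main__":
    closes = [100.0, 101.0, 101.0, 99.5, 102.0]  # made-up prices
    print(direction_labels(closes))
    print(simple_moving_average(closes, 3))
```

Macro, sentiment, or constituent data would be additional inputs alongside features like these, which is the part I'm unsure is necessary.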
I'd appreciate guidance from people who have actually worked on financial ML projects.
Thanks.
u/stickPotatoe — 1 day ago