How do I optimize SQL queries for large datasets as a beginner data analyst?
I'm a beginner data analyst (about 6 months into learning SQL) and I'm working with a retail sales dataset that has around 500,000 rows. My queries are running quite slowly and I'm not sure where to start with optimization.
Here's a typical query I'm running:
SELECT product_category, SUM(sales_amount) AS total_sales
FROM sales_data
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY product_category
ORDER BY total_sales DESC;
This takes about 8-10 seconds to run. I've heard about indexing, but I'm not sure how or where to apply it. Any tips or resources would be really helpful!
u/Only-Economist1887 — 14 hours ago