r/DataScientist

How are you keeping up with AI updates these days?

I’ve been running into the same issue recently—too many sources (research blogs, company updates, media), and a lot of overlap or noise.

I built a small pipeline to experiment with this:

- ingestion from curated sources
- deterministic filtering + deduplication
- LLM-based scoring (relevance, importance, novelty)
- clustering of related content
- structured digest output
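
The deterministic filtering + deduplication step could be sketched roughly as below. This is not the poster's code: the item shape (dicts with `url` and `title` keys) and the helper names `canonical_url`, `title_fingerprint`, and `dedupe` are assumptions made for illustration. Dropping the whole query string is also an assumption, and it would be too aggressive for sites that identify articles by query parameter.

```python
import hashlib
import re
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Lowercase scheme and host; drop query string, fragment, and trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, "", ""))


def title_fingerprint(title: str) -> str:
    """Hash the title after removing case, punctuation, and extra whitespace."""
    words = re.sub(r"[^a-z0-9 ]+", " ", title.lower()).split()
    return hashlib.sha1(" ".join(words).encode("utf-8")).hexdigest()


def dedupe(items):
    """Keep the first item seen for each canonical URL and each title fingerprint."""
    seen_urls, seen_titles, kept = set(), set(), []
    for item in items:
        url_key = canonical_url(item["url"])
        title_key = title_fingerprint(item["title"])
        if url_key in seen_urls or title_key in seen_titles:
            continue  # same link or same headline already kept
        seen_urls.add(url_key)
        seen_titles.add(title_key)
        kept.append(item)
    return kept
```

Because this step is exact-match only, it catches reposts and tracking-parameter variants cheaply; near-duplicates with reworded headlines would still need the later clustering step.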

Main goal was to reduce context switching and make it easier to focus on what actually matters.

Curious how others here approach this—tools, workflows, or habits?

reddit.com

u/Elinova_3911 — 14 hours ago

▲ 7 r/DataScientist+2 crossposts

[FOR HIRE] Data Scientist / ML Engineer / AI Engineer | 4 YOE | Python, XGBoost, LightGBM, LLMs, MLflow, Spark | Remote | Full-time or Contract

---

**Who I am**

Hi! I'm Keshav, a Data Scientist, ML Engineer, and AI Engineer with ~4 years of experience building production ML and AI systems — from raw feature engineering to model deployment and monitoring. I specialize in taking models from experimentation all the way to production.

---

**What I do well**

🔹 Supervised/Unsupervised ML — XGBoost, LightGBM, scikit-learn, PyTorch

🔹 LLM & GenAI pipelines — RAG, prompt engineering, fine-tuning, agentic workflows

🔹 MLOps — MLflow, Docker, Kubernetes, Airflow, CI/CD for model deployment

🔹 Data Engineering — BigQuery, Snowflake, Spark, dbt, SQL at scale

🔹 FastAPI-based ML services & REST API productionization

---

**Recent portfolio highlights**

📌 **CreditSense AI** — End-to-end credit risk scoring product built with FastAPI, XGBoost/LightGBM, deployed on Railway. Targets the Indian fintech market.

→ github.com/keshavloma1081-ctrl/Creditsense-ai

📌 AI evaluation & annotation tasks including agentic coding evals comparing LLM model responses (Labelbox).

---

**Stack at a glance**

---

**Availability**

📍 Based in Delhi NCR · Open to fully remote roles globally

🕐 Available for: Full-time employment, long-term contracts, or project-based freelance

💬 DM me — happy to share CV, portfolio, or jump on a call.

---

**Best fit for**

Fintech · Healthcare AI · Analytics platforms · LLM-powered products · Fraud/Risk modeling · Data pipelines · AI/ML startups

u/New_Conclusion_2211 — 1 day ago

▲ 2 r/DataScientist

Postcode is one of the most underrated features in modelling

One thing that has consistently surprised me across different companies is how strong postcode features tend to be in models.

At first glance, it's surprising that it's so predictive (it's "just geography"), but then it clicks: people tend to live in areas with somewhat like-minded people, and the (visible) area-level behaviours often correlate well with the individual behaviours that we're interested in.

The features captured for each postcode are proxies for behaviours that are hard to observe directly (renewal propensity, fraud, risk):

- demographics
- deprivation
- housing characteristics
- crime exposure
- transport access
- general behaviour patterns

The other issue is that postcode data is rarely "done properly". It's often:

- built once and never updated
- very incomplete
- or treated as a static lookup rather than something that evolves over time
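
One way to avoid the static-lookup problem is to store dated snapshots per postcode and fetch the snapshot that was in force on the date of each record. The sketch below is only an illustration of that idea: the class `PostcodeFeatureStore`, the `normalise` helper, the feature name `deprivation_decile`, and the sample postcode in the usage note are all made up, and a real setup would more likely live in a warehouse table than in memory.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def normalise(postcode: str) -> str:
    """Uppercase and strip all whitespace so 'ab1 2cd' and 'AB12CD' match."""
    return "".join(postcode.upper().split())


class PostcodeFeatureStore:
    """Versioned postcode features with point-in-time lookup (hypothetical)."""

    def __init__(self):
        self._dates = defaultdict(list)     # postcode -> sorted effective dates
        self._features = defaultdict(list)  # postcode -> feature dicts, same order

    def add_snapshot(self, postcode: str, effective: date, features: dict) -> None:
        """Insert a snapshot, keeping each postcode's dates sorted."""
        key = normalise(postcode)
        idx = bisect_right(self._dates[key], effective)
        self._dates[key].insert(idx, effective)
        self._features[key].insert(idx, features)

    def lookup(self, postcode: str, as_of: date):
        """Return the latest snapshot effective on or before as_of, else None."""
        key = normalise(postcode)
        dates = self._dates.get(key, [])
        idx = bisect_right(dates, as_of)
        if idx == 0:
            return None  # unknown postcode, or no snapshot that early
        return self._features[key][idx - 1]
```

For example, after adding snapshots for "AB1 2CD" dated 2023-01-01 and 2024-01-01, `store.lookup("AB1 2CD", date(2023, 6, 1))` returns the 2023 features, so a model trained on mid-2023 records does not see values that were only published later.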

Of course, there are important considerations around fairness and bias here, since geographic features can correlate with socio-economic factors. In practice, how these features are used depends heavily on the application and regulatory context.

Curious how others are handling this -- do you tend to use postcode features, or is it something that gets deprioritised?

u/Sweaty-Stop6057 — 17 hours ago

▲ 1 r/DataScientist+1 crossposts

App that tells you exactly what is wrong in your Python code

u/Few_Definition5707 — 1 day ago

▲ 1 r/DataScientist+1 crossposts

Tensorflow

u/Few_Definition5707 — 3 days ago

▲ 1 r/DataScientist

What technical skills are covered in a Data Science course in Bangalore?

u/Royal-Prune3496 — 2 days ago

▲ 0 r/DataScientist

How are you benchmarking forecasting models across classical, ML, and deep learning approaches?

u/Ankur_Packt — 4 days ago

▲ 1 r/DataScientist

Anyone else tired of babysitting Colab notebooks? I built a way to run them like jobs

u/jerronl — 2 days ago

▲ 3 r/DataScientist

How would you measure response diversity in an AI chatbot?

u/OkTraffic2096 — 4 days ago

▲ 13 r/DataScientist

Is Statistics a good major to pick if I want to pursue Data Science?

So I've gotten the chance to study statistics at one of the best universities in my country. It's almost free of cost. I've also got the opportunity to study computer science at another university, but it'll be too expensive for me.

So I guess my question is: can I still become a data scientist by studying statistics?

u/Peasent_in_Yellow28 — 8 days ago

▲ 5 r/DataScientist

Disillusioned with the "DS" job market. Is switching to SWE the only way to keep doing actual engineering?

u/Cultural_Record7515 — 7 days ago

▲ 4 r/DataScientist+1 crossposts

Quant researcher → Data Scientist pivot - worth it?

Hi all, I'm making a huge life decision and deciding between 2 job offers, so I would really appreciate perspectives from people in the DS field.

For some background, I’m currently a quantitative researcher working in corporate bond trading at a large bank in NYC. My work is fairly modeling-heavy (pricing, analytics) so I have strong research skills but not as much experience with the more formal DS workflow or software (Spark, Hadoop, AWS, etc).

Offer 1 (NYC) - Quant researcher role at a company that builds fixed income pricing models (company is a vendor to trading firms, so more product-focused, not actually trading)

Higher compensation
Stronger alignment with my current skillset
Similar to 'Applied Scientist' roles at some tech firms, with a strong data science component (tech stack, release cycles, product focus)
I'm really excited about this role as it marries my experience with my desire to get away from the day-to-day stress of trading.

Offer 2 (Chicago) - Data scientist at a consumer credit agency. Role would focus on credit risk modeling for clients.

More traditional DS role.
Located in Chicago (my family and I would ideally like to live there long-term)
However, I do like the idea of a role in consumer credit risk. It's practical, there will always be demand for it and there are lots of companies to transition to (PayPal, Stripe, Capital One, etc).

Goals / concerns:

Chicago is a preferred long-term location for personal/lifestyle reasons.
In a perfect world, I could do the quant job in Chicago but there are no companies like that there.
I also wouldn't mind staying in NYC for a few more years before looking in DS again, but my concern is that I'm missing a golden opportunity to relocate and break into DS that I might not get again, even though the role itself is suboptimal.
I really want to get away from the day-to-day aspect and PnL pressure of trading so I wouldn't want to transition to a pricing role at a Chicago prop shop

How I’m thinking about it:

The DS role is a more direct path into the field (especially for credit/lending/fintech roles later) but it comes with a pay cut and potentially weaker long-term growth at that specific company
The quant role keeps me on a strong comp/skill trajectory, but makes the DS pivot less direct and requires more intentional repositioning. It also maintains the friction of transitioning cities as well as jobs, down the line.

Questions:

Does starting in the credit-focused DS role meaningfully improve long-term opportunities vs. transitioning later, or would my less common background from the pricing role help me stand out?
Am I underestimating how competitive DS roles are for someone without direct experience?
Would taking a pay cut now for a “cleaner” transition path be worth it in your view?

Appreciate any thoughts, especially from people who’ve made similar transitions or hired for DS roles.

Thanks!

Edit: to be clear, the options I'm considering are either to take the Chicago DS job now, or to take the NYC quant job now and look for a better-paying DS job in Chicago in a few years.

u/Grouchy-Load562 — 6 days ago

▲ 0 r/DataScientist

The correlation between subtitle entry errors in real-time data streams and the initial response process

u/mattkahnn — 4 days ago

▲ 9 r/DataScientist

Final Year CV

I need to apply for a data scientist position. What do I need to improve?

u/xoticskull — 8 days ago

▲ 1 r/DataScientist+4 crossposts

I'm looking for data science trainers, only from India

u/Ok_Ambition_7981Nk — 7 days ago

▲ 2 r/DataScientist+1 crossposts

Macbook pro vs Asus G14

I'm unsure which laptop is better for data science: the MacBook Pro M5 or the Asus G14 with an RTX 5070 Ti. Both have 32 GB of RAM. I want a laptop for a data science master's.

u/NeedleworkerWeak6192 — 7 days ago

▲ 2 r/DataScientist

Testing a New Product for Data Science Beginners

I am building a platform for beginner data science students.

The goal is to help students build projects on their own without depending completely on long project tutorials.

Instead of giving the full project directly, the platform breaks the project into small tasks so students can think, build, and learn step by step.

I want to understand:

- Whether this approach feels useful
- Which parts feel confusing
- Where students get stuck
- Whether it feels better than watching full tutorials

I am not selling anything right now. I only want honest feedback from people who are learning data science.

Website - https://sted.co.in/

u/Jealous_Parfait_6457 — 7 days ago

▲ 1 r/DataScientist

[Selling] German Job Market Dataset - 150K Indeed.de listings (April 2026) - 38 fields including salary data

Fresh scrape from Indeed.de (April 2026). Perfect for ML, research, or HR analytics.

📊 What you get:
- 150,936 unique jobs
- 38 fields: title, company, description, location, salary flags, apply counts, ratings
- CSV format (~455MB)
- 100% valid data, no duplicates

📥 Free sample (5,000 jobs): IN COMMENTS

💰 Price: 200 USD
📦 Delivery: 2h

🎯 Use for:
- Job market research
- ML training data
- Salary benchmarking
- Competitive intelligence

Tg: @ gdataxxx

u/dracariz — 5 hours ago

▲ 1 r/DataScientist

Credit Risk Modeling using Python

u/credit-risk-modeling — 10 hours ago

▲ 1 r/DataScientist

Need a Data Science study buddy (daily)

u/CornerRecent9343 — 1 day ago