r/epidemiology

Weekly Advice & Career Question Megathread

Welcome to the r/epidemiology Advice & Career Question Megathread. All career and advice-type posts must be posted within this megathread.

Before you ask, we might already have your answer! To view all previous megathreads and Advice/Career Question posts, please go here. For our wiki page of resources, please go here.

reddit.com
u/AutoModerator — 2 days ago

Every pharmacovigilance database I try has a different wall. Is this study feasible without institutional access?

I posted in r/AskAcademia a week ago about being stuck on IRB funding for an independent public health study on caffeine product labeling. I got a lot of feedback telling me to slow down, get a faculty sponsor, and start with the systematic review before trying to collect primary data. I took that advice, but now I keep hitting new walls, and I am starting to feel like I am missing something obvious.

The "novel" contribution of the study is dose-tier stratification of caffeine adverse events. Caffeinated products vary enormously in caffeine content: a cup of coffee might have 80 mg while a pre-workout might have 400 mg. Yet no public database categorizes adverse event reports by how much caffeine was actually in the product involved. I hypothesize that if labeling failures are driving harm, adverse event increases should be concentrated in the highest-dose products, the ones whose caffeine content consumers are least able to estimate accurately.
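To make the tier idea concrete, here is a minimal sketch of what the mapping step could look like. The cut points (100, 200, 300 mg) and the name `dose_tier` are hypothetical placeholders for illustration only, not the study's actual tier definitions; the 80 mg and 400 mg inputs are the coffee and pre-workout figures above.

```python
from bisect import bisect_right

# Hypothetical tier boundaries in mg of caffeine per serving.
# These are illustrative assumptions, not the study's real cut points.
TIER_CUTS_MG = [100, 200, 300]

def dose_tier(caffeine_mg: float) -> str:
    """Return a tier label (tier 1 = lowest dose, tier 4 = highest)."""
    if caffeine_mg < 0:
        raise ValueError("caffeine content cannot be negative")
    # bisect_right counts how many cut points are <= the dose,
    # so a dose exactly on a boundary falls into the higher tier.
    return f"tier {bisect_right(TIER_CUTS_MG, caffeine_mg) + 1}"

print(dose_tier(80))   # cup of coffee
print(dose_tier(400))  # pre-workout
```

With these assumed boundaries the coffee lands in the lowest tier and the pre-workout in the highest. The hard part is not this function but getting a reliable caffeine amount for each product in the first place.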

The systematic review is registered on PROSPERO and moving forward. The survey arm is parked until I land a faculty sponsor. The database analysis is where I keep running into problems.

I pulled the publicly available HFCS data, the FDA food and dietary supplement adverse event database formerly known as CAERS. After filtering for caffeine-relevant products and ages 12-24 from 2014-2024, I have 238 records. The data has brand names so tier mapping is theoretically possible, but 238 records across 11 years and 4 tiers is too sparse for the regression I designed the analysis around. The trend also goes down rather than up, which may reflect reporting pattern changes rather than actual exposure trends.
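A rough back-of-envelope check of how sparse that is, assuming (unrealistically) that the 238 records were spread evenly over every year-by-tier cell and that counts behave roughly like Poisson counts:

```python
import math

records = 238
years = 11   # 2014 through 2024 inclusive
tiers = 4

cells = years * tiers
mean_per_cell = records / cells

# For a Poisson count, the standard deviation is sqrt(mean),
# so the relative noise in a single cell is 1 / sqrt(mean).
rel_se = 1 / math.sqrt(mean_per_cell)

print(f"{cells} year-by-tier cells, about {mean_per_cell:.1f} records per cell")
print(f"relative standard error per cell: roughly {rel_se:.0%}")
```

That works out to about 5 records per cell with noise on the order of 40% in each one, and the real distribution is surely more uneven than this, so some cells will be empty or close to it.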

NPDS has the volume I need. A 2025 paper found over 32,000 caffeine energy product exposures in NPDS from 2011-2023 among individuals under 20. I am submitting a formal non-member data request right now. The problem I just hit is that getting brand-level product identifiers requires written authorization letters from each brand owner. Without brand names I cannot map products to dose tiers and the whole point collapses.

I am requesting Poisindex product ID codes without brand names and planning to resolve the lookup problem when I have institutional access after transferring to a four-year university. But that could be a year away, and I am not sure the study holds together in the meantime.

I want to be clear that I am not complaining about the difficulty. I knew going in that this would be hard (as many of you also told me), and I have no illusions about my limitations as a first-year community college student doing this without institutional support. But I have put a significant amount of work into this, and I am afraid that the limitations I keep uncovering are compounding to the point where this whole arm of my project is not executable in its current form. I would rather hear that now from people who know more than I do than find out after another few months of work.

Is there a framing of this question that gets around the brand identification problem? Is there a database I have not found that captures caffeinated product adverse events with dose information already attached? Is the surveillance gap itself the publishable finding rather than the trend analysis I designed? Am I missing a perspective entirely?

u/cjfitguy — 2 days ago

Reading recommendations?

What readings (books, reports, articles) would you recommend for someone with a master's in epidemiology and a few years of field experience? I'm looking for books to read in my own time to refresh my memory and improve my critical thinking about causality and bias. Could be anything, fiction or non-fiction.

Thx!

u/punk-recluse-2834 — 5 days ago

I am a PhD epidemiologist who had to retire due to dementia

I’ve tried to post my YouTube channel here before, but it keeps getting deleted by the mod bot, so I’m posting again in the hope that my fellow epidemiologists will be able to see it. Please don’t delete me again, and please support me in this involuntary journey, which is now my N-of-one real-world evidence study.

https://youtube.com/@incasethisendsbadly

Ami

u/ResponsibleParking13 — 6 days ago

reCAPTCHA on PubMed Central can go to hell

What is going on with reCAPTCHA this month on PMC?? I don't want to spend 5 minutes clicking on pictures of cars to access an article.

u/implante — 6 days ago

R₀ estimate of 2.76 for the MV Hondius ANDV outbreak — how generalizable is this?

A recent preprint estimated the R₀ for the MV Hondius Andes hantavirus outbreak at 2.76 within the cruise ship setting, while cautioning against directly extrapolating that estimate to broader community transmission.

MV Hondius is a relatively small polar expedition vessel carrying roughly 170 passengers, with a more outdoor-focused itinerary than a typical large resort-style cruise ship. That made me curious how epidemiologists think about interpreting transmission estimates across different confined environments.

A few questions I’d appreciate expert perspective on:

  1. What would a reasonable community-level adjustment look like for a confined-setting R₀ estimate like this?

  2. Is it unusual that WHO hasn’t publicly published an R₀ estimate at this stage, or is that standard practice early in outbreaks with limited data?

  3. Given the 1–8 week incubation window, what epidemiological signals over the next several weeks would most strongly distinguish a contained cluster from broader transmission concerns?

Reuters also reported that French officials said full sequencing of the outbreak strain is still ongoing, which made me wonder how much uncertainty epidemiologists typically tolerate before becoming concerned about potentially unusual transmission dynamics in outbreaks like this.

I'm genuinely trying to better understand how epidemiologists interpret uncertainty during early outbreak stages, not to imply conclusions beyond the available data.


Sources:
• Preprint: https://arxiv.org/abs/2605.07498
• ECDC outbreak update: https://www.ecdc.europa.eu/en/infectious-disease-topics/hantavirus-infection/surveillance-and-updates/andes-hantavirus-outbreak
• Reuters reporting on sequencing uncertainty: https://www.reuters.com/business/healthcare-pharmaceuticals/french-minister-says-it-is-not-certain-if-hantavirus-strain-cruise-ship-has-2026-05-12/

u/PieIcy4638 — 7 days ago