r/stata
Hi! We're currently writing a thesis on conditional vs. unconditional fiscal transfers in selected cities over a given time period. Which statistical tests should we run to strengthen our model?
So far we have run these tests: Hausman, Wald, Wooldridge, VIF, overall significance, and individual significance. With panel data, should we look only at the within R² to assess the goodness of fit of the model?
Hope you could help us. Thank you!
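For context on what the within R² measures: in a fixed-effects regression it is the R² of the regression run on entity-demeaned data, so it leaves out whatever the city effects themselves explain. Below is a minimal sketch with made-up numbers and a single regressor. It is written in Python only to make the arithmetic explicit, and the helper names `demean` and `slope_and_r2` are hypothetical.

```python
from collections import defaultdict

def demean(values, groups):
    """Subtract each group's mean from its members."""
    sums, counts = defaultdict(float), defaultdict(int)
    for g, v in zip(groups, values):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for g, v in zip(groups, values)]

def slope_and_r2(xd, yd):
    """OLS slope and R^2 for already-centered x and y (one regressor)."""
    sxx = sum(a * a for a in xd)
    sxy = sum(a * b for a, b in zip(xd, yd))
    syy = sum(b * b for b in yd)
    beta = sxy / sxx
    ssr = sum((b - beta * a) ** 2 for a, b in zip(xd, yd))
    return beta, 1 - ssr / syy

# Made-up panel: two cities, three periods, y = city effect + 2 * x
entity = ["c1", "c1", "c1", "c2", "c2", "c2"]
x = [1, 2, 3, 1, 2, 3]
y = [12, 14, 16, 52, 54, 56]

# Within: demean by city, so the city effects drop out
beta_w, r2_within = slope_and_r2(demean(x, entity), demean(y, entity))

# Pooled: demean by the grand mean only (plain OLS with a constant)
pooled = ["all"] * len(x)
beta_p, r2_pooled = slope_and_r2(demean(x, pooled), demean(y, pooled))

print(beta_w, r2_within, r2_pooled)
```

In this toy panel x explains all of the variation within each city, yet the pooled R² is close to zero because most of the variation in y is between cities. That gap is why the different R² figures reported for panel models answer different questions.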
I need help reproducing a variable from an academic paper in Stata. I created a do-file and would like to verify that it is correct before proceeding with the empirical analysis. Specifically, I am trying to construct the money center roadshow variable, which is defined as an indicator equal to 1 for a three-day window in which the firm has flights to two or more money centers, and 0 otherwise. The money centers are Boston, Chicago, New York, and San Francisco. I appreciate the help!
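One way to pin the definition down before checking the do-file is to write the logic out step by step. The sketch below is in Python purely for illustration and rests on several assumptions that are not in the post:

- the data are one record per flight, as `(firm_id, flight_date, dest_city)`;
- the window runs forward over calendar days t, t+1, t+2;
- "two or more money centers" means two or more distinct cities;
- city names are spelled exactly as in the post.

The function name `roadshow_flags` is hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

MONEY_CENTERS = {"Boston", "Chicago", "New York", "San Francisco"}

def roadshow_flags(flights, window_days=3):
    """Map (firm_id, window_start_day) -> 0/1 roadshow indicator.

    A window starting on day t covers t, ..., t + window_days - 1 and is
    flagged 1 if the firm flew to >= 2 distinct money centers in it.
    Every calendar day between a firm's first and last flight gets a flag.
    """
    visits = defaultdict(set)   # (firm, day) -> money centers flown to
    first, last = {}, {}
    for firm, day, city in flights:
        first[firm] = min(day, first.get(firm, day))
        last[firm] = max(day, last.get(firm, day))
        if city in MONEY_CENTERS:
            visits[(firm, day)].add(city)

    flags = {}
    for firm in first:
        day = first[firm]
        while day <= last[firm]:
            centers = set()
            for k in range(window_days):
                centers |= visits.get((firm, day + timedelta(days=k)), set())
            flags[(firm, day)] = int(len(centers) >= 2)
            day += timedelta(days=1)
    return flags

# Made-up flights for two firms
flights = [
    ("A", date(2020, 1, 6), "Boston"),
    ("A", date(2020, 1, 8), "New York"),
    ("A", date(2020, 1, 10), "Denver"),
    ("B", date(2020, 1, 6), "Chicago"),
    ("B", date(2020, 1, 9), "San Francisco"),
]
flags = roadshow_flags(flights)
print(flags[("A", date(2020, 1, 6))], flags[("B", date(2020, 1, 6))])
```

Firm A reaches Boston and New York within the window starting January 6, so that window is flagged. Firm B's two money-center flights are four calendar days apart, so none of its windows qualifies. If the paper uses trading days, a centered window, or counts repeat visits to the same city, the corresponding assumption above would need to change.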
Long time reader, first time poster here.
Are you intrigued by or already using coding agents?
Me too. But agents hallucinate, confidently make decisions we don't approve of, and only sometimes disclose their assumptions. Also their style is often odd 😂
The software dev/data science community has been tackling this problem for a while now with all sorts of tools and guardrails for agents. The hottest one on the block right now: agent skills.
Over the years of doing econometrics I developed my own set of favoured approaches, tools, assumptions, etc. (as I am sure you all have too!)
I packaged mine into a set of rules and skills for my coding agents & I am a bit shocked HOW MUCH BETTER they get at doing things the way I want.
I built these to streamline my own econometrics research. The defaults that ship with general-purpose AI tools are uneven: they happily generate plain TWFE on staggered treatment, report F > 10 as a sufficient first-stage test, paste regression numbers into LaTeX by hand, and mix red-green palettes for treatment vs. control. The skills here force the agent into the patterns I want (and the patterns I think most applied economists should want).
They are deliberately highly opinionated. The opinions come from:
- DIME Analytics' DIME Wiki: iefolder, iecodebook, ieduplicates, master do-files, the four-tier replicability standard, the Reviewing Graphs and Submit Table checklists, and the general "single source of truth + master orchestrator" mindset (translated to Python, Julia, and LaTeX where DIME's guidance is Stata-only).
- Modern econometrics literature: Goodman-Bacon (2021), Callaway-Sant'Anna (2021), Sun-Abraham (2021), Borusyak-Jaravel-Spiess (2024), de Chaisemartin-D'Haultfoeuille; Olea-Pflueger (2013) and Lee et al. (2022) for weak IV; Calonico-Cattaneo-Titiunik (2014) for RDD; Cameron-Gelbach-Miller (2008) and MacKinnon-Webb (2018) for wild cluster bootstrap; Roth, Sant'Anna, Bilinski & Poe (2023) for the modern DiD landscape.
- Modern packages: fixest/pyfixest, did, didimputation, eventstudyinteract, csdid, did_imputation, boottest/fwildclusterboot, rdrobust, linearmodels, modelsummary/stargazer/esttab.
The repo was inspired by meleantonio/awesome-econ-ai-stuff — the original curated catalog. This is a narrower, more opinionated rewrite focused on four workflow stages.
Have a look, use, critique, contribute if you fancy: https://github.com/JonasWeinert/EconAgentSkills