u/Clover_Dale — reddlx

This might be a silly question that really shows off my ignorance, but I keep stumbling on it! In the agronomy/crop science/weed science papers I'm reading, data from field trials may be analyzed and presented within years or pooled across years, depending on whether there are significant year by treatment interactions. My first interpretation of this is the following workflow:

1. Test a "full" model with year, treatments, and their interactions as fixed factors.
2. After checking model fit and assumptions, run an ANOVA to check for significant interactions.
   a. If there are no significant year by treatment interactions, continue on to post-hoc analyses (after maybe fitting a new model with year as a random factor if appropriate?); OR
   b. If there are significant year by treatment interactions, literally split the data by year and fit a separate model for each year, conducting subsequent ANOVA and post-hoc tests for each model.
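
For concreteness, here is a rough Python sketch of the interaction check in that workflow. It assumes a balanced design with year and treatment both treated as fixed factors. The yields are toy numbers made up purely for illustration, and the function name `interaction_f` is hypothetical rather than taken from any package.

```python
from statistics import mean

# Hypothetical toy data, made up for illustration:
# data[year][treatment] = yields from replicate plots
data = {
    "2021": {"A": [4.0, 5.0, 6.0], "B": [6.0, 7.0, 8.0]},
    "2022": {"A": [5.0, 6.0, 7.0], "B": [5.0, 6.0, 7.0]},
}

def interaction_f(data):
    """F statistic for year x treatment in a balanced two-way fixed-effects ANOVA."""
    years = list(data)
    trts = list(data[years[0]])
    n = len(data[years[0]][trts[0]])  # replicates per cell (balance is assumed)

    grand = mean(y for yr in years for t in trts for y in data[yr][t])
    year_mean = {yr: mean(y for t in trts for y in data[yr][t]) for yr in years}
    trt_mean = {t: mean(y for yr in years for y in data[yr][t]) for t in trts}
    cell_mean = {(yr, t): mean(data[yr][t]) for yr in years for t in trts}

    # Interaction SS: what the cell means do beyond additive year + treatment effects
    ss_int = n * sum(
        (cell_mean[yr, t] - year_mean[yr] - trt_mean[t] + grand) ** 2
        for yr in years for t in trts
    )
    # Residual SS: plot-to-plot scatter around each cell mean
    ss_err = sum(
        (y - cell_mean[yr, t]) ** 2
        for yr in years for t in trts for y in data[yr][t]
    )
    df_int = (len(years) - 1) * (len(trts) - 1)
    df_err = len(years) * len(trts) * (n - 1)
    return (ss_int / df_int) / (ss_err / df_err), df_int, df_err

f_stat, df_int, df_err = interaction_f(data)
print(f_stat, df_int, df_err)
```

The F statistic would then be compared against an F distribution with `df_int` and `df_err` degrees of freedom to decide which branch of step 2 applies. Unbalanced data or random year effects would need a proper modelling package rather than these hand-computed sums of squares.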

It occurred to me that this could also be interpreted as keeping the full model with data pooled across years, but only drawing conclusions from emmeans grouped by year.
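
To make the difference between the two readings concrete, here is a minimal Python sketch comparing a within-year treatment difference estimated with the pooled residual from the full model against the same difference estimated from that year's data alone. It assumes a balanced, fixed-effects design with equal residual variance across years. The data are made-up toy numbers, and `residual` and `within_year_diff` are hypothetical helper names.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data, made up for illustration:
# plots[year][treatment] = yields from replicate plots
plots = {
    "2021": {"A": [4.0, 5.0, 6.0], "B": [6.0, 7.0, 8.0]},
    "2022": {"A": [5.0, 6.0, 7.0], "B": [5.0, 6.0, 7.0]},
}

def residual(cells):
    """Return (sum of squares, df) of plots around their own cell means."""
    ss = sum((y - mean(ys)) ** 2 for ys in cells for y in ys)
    df = sum(len(ys) - 1 for ys in cells)
    return ss, df

def within_year_diff(plots, year, pooled):
    """B minus A in one year, with the SE based on a pooled or year-only residual."""
    a, b = plots[year]["A"], plots[year]["B"]
    if pooled:
        # Full model: residual variance estimated from every year x treatment cell
        cells = [ys for yr in plots for ys in plots[yr].values()]
    else:
        # Split model: residual variance estimated from this year's cells only
        cells = [a, b]
    ss, df = residual(cells)
    se = sqrt((ss / df) * (1 / len(a) + 1 / len(b)))
    return mean(b) - mean(a), se, df

print(within_year_diff(plots, "2021", pooled=True))
print(within_year_diff(plots, "2021", pooled=False))
```

In this balanced toy case the estimated difference is the same either way, and what changes is the residual used for the standard error and its degrees of freedom (8 pooled versus 4 for the single year). The pooled version is only reasonable if residual variance is similar across years.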

In the project I'm currently analyzing, I have multiple response variables, some of which have year by treatment interactions while others do not. I've been using the first approach, but could I have been wasting my time fitting so many models and cutting down my sample sizes?

Again, I apologize if this is a silly question. I look forward to any thoughts on the topic! TYIA!