How to report random effect in the mixed effects model

Question

I use the iris dataset in R as the example below.

library(lme4)
mixed.fit.b <- lmer(Sepal.Width ~ Sepal.Length + (1+ Sepal.Length|Species), data = iris)
summary(mixed.fit.b)

ranef(mixed.fit.b)$Species
coef(mixed.fit.b)$Species
predict(mixed.fit.b)

The random intercept and slope are shown below.

           (Intercept) Sepal.Length
setosa     -0.49549054   0.78331501
versicolor  1.19197858   0.26689317
virginica   1.17260303   0.27282273

That means the intercept is -0.49549054 (fixed + random intercept) and slope is 0.78331501 (fixed + random slope) for setosa right? So, there are three couples of intercepts and slopes. In a general linear model, we can say the y = intercept + slope and the y changed a slope per x. But in mixed models, there are three three couples of intercepts and slopes. How to report them?

"How to report them?" Usually not at all. If they were something you'd want to report, I'd question why you consider Species a random effect. Usually, I'd report the fixed effects and the variances of the random effects. (A random effect with only three subjects is very questionable. That's not enough to estimate variances reliably. A fixed effects model should be used here.) — Roland, Jan 26 '23 at 07:55
I agree with your point about random effects clusters but I have to respectfully disagree about the random effects reporting. While it is not common practice, it really should be, and I detail why below. — Shawn Hemelstrand, Jan 26 '23 at 11:16
@ShawnHemelstrand I did say that the variances (and yes, also the correlations) should be reported. OP's question is more specific than asking how to report random effects. I prefer to [depict subject-level predictions](https://stats.stackexchange.com/a/99049/11849) instead of coefficients. — Roland, Jan 27 '23 at 06:11
Ah I think I misread your intent there. I've amended my answer. — Shawn Hemelstrand, Jan 27 '23 at 06:20
Dear @Roland, sorry for the inappropriate example. I used iris dataset because it is built-in simple dataset. I totally agree with your worry. I'm not trying to illustrate a specific problem through this, sorry for the inappropriate expression. — li jiaqi, Jan 30 '23 at 04:29
No worries. I think it's just good to know in the future. By the way, the `lme4` and `lmerTest` packages actually contain datasets you can practice with. — Shawn Hemelstrand, Jan 30 '23 at 04:58

Shawn Hemelstrand · Accepted Answer · 2023-01-27T06:19:41.277

Recommendation

I agree with Roland that using a mixed model with three clusters is not advisable (see Gelman & Hill, 2007, linked below). Speaking to your main question, we would like to know how much random variation is being teased apart by the model, and in fact this is really supposed to be used as justification to use the model in the first place (see Meteyard & Davies, 2020, where they propose fitting random effects-only models as a precursor to your full model).

With this in mind, what I would recommend is reporting these parts of your model to account for the random effects (including what you mentioned). While they are not all necessary, they are certainly informative:

SD of random intercepts/random slopes
Correlation of any random effects (and if possible an explanation of why)
ICC of your model (this will explain how much clustering is occurring)
Pseudo R2, which tries to explain how much of the effects are explained by the fixed effects and how much are explained by both fixed and random effects.
Caterpillar plot of random effects.

Worked Example: Fitting the Model

I will continue to use the data you have as an example, but I will change it to something that converges, as your first model was singular, which is not usable when reporting. I have loaded four libraries necessary for the functions used and fit the model.

#### Load Libraries ####
library(lmerTest)
library(lattice)
library(broom.mixed)
library(performance)

#### Fit Model ####
fit <- lmer(Petal.Length
                    ~ Petal.Width
                    + (1 + Petal.Width |Species), 
                    data = iris)

Summarizing the Random Effect Variance

Next, we can summarize the model, but I prefer to use the tidy function from the broom.mixed package for quickly obtaining RE estimates.

##### Summarise Model ####
tidy(fit)

Shown below:

# A tibble: 6 × 8
  effect   group    term                      estim…¹ std.e…² stati…³    df p.value
  <chr>    <chr>    <chr>                       <dbl>   <dbl>   <dbl> <dbl>   <dbl>
1 fixed    NA       (Intercept)                 2.43    0.898    2.71  1.97   0.115
2 fixed    NA       Petal.Width                 1.10    0.414    2.65  2.00   0.118
3 ran_pars Species  sd__(Intercept)             1.52   NA       NA    NA     NA    
4 ran_pars Species  cor__(Intercept).Petal.W…  -0.408  NA       NA    NA     NA    
5 ran_pars Species  sd__Petal.Width             0.646  NA       NA    NA     NA    
6 ran_pars Residual sd__Observation             0.362  NA       NA    NA     NA    
# … with abbreviated variable names ¹estimate, ²std.error, ³statistic

Based off this information, we now know:

The species vary in conditional average petal length by about 1.52 SD (after controlling for fixed and random effects).
By-species slopes vary less (.646 SD), indicating that the relationship between petal width and petal length doesn't change much by species.
The correlation between petal width and petal length is about -.41, indicating that as the average petal length increases, the slopes generally decrease.

The appendix for one of the cited articles above shows how you can report this information, which is the main question you had if I am not mistaken.

Caterpillar Plot of Random Effects

We can report this information in an article if necessary, but we can also plot these effects with a caterpillar plot:

#### Plot Dotplot of Random Effects ####
dotplot(ranef(fit))

This is partly why Roland warned you about having a higher number of random effect clusters. You only see three intercepts and slopes here, which doesn't capture a lot of variation that you would want to get from a mixed model. However, we can make two key insights here:

The virginica plant has much higher petal lengths on average (random intercepts) than the other species.
The versicolor plant has a higher slope term for petal widths than the others.

Other Random Effects Estimates

We can also obtain the ICC and pseudo R2 estimates with this code:

#### Check Performance ####
performance(fit)

Shown below:

# Indices of model performance

AIC     |    AICc |     BIC | R2 (cond.) | R2 (marg.) |   ICC |  RMSE | Sigma
-----------------------------------------------------------------------------
154.478 | 155.066 | 172.542 |      0.957 |      0.231 | 0.944 | 0.355 | 0.362

What to note from this summary:

The ICC indicates that there is a lot of random effects clustering...about 94.4% to be exact, indicating that most of the explained variance comes from the random effects in the model.
The conditional R2 is the variance explained by both fixed and random effects, which turns out to be 95.7%.
The marginal R2 states how much variance is accounted for by fixed effects, which is about 23.1%.

Citations

Below are the citations I mentioned earlier. Gelman & Hill is a canonical source for learning about mixed models. The article by Meteyard & Davies is a best-practice guide for running mixed models. Let me know if you found this answer helpful.

Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
Meteyard, L., & Davies, R. A. I. (2020). Best practice guidance for linear mixed-effects models in psychological science. Journal of Memory and Language, 112, 104092. https://doi.org/10.1016/j.jml.2020.104092

Thank you very much for your careful explanation and citations. That is what I want to know. Sorry for the inappropriate example. I used iris dataset because it is built-in simple dataset. I totally agree with you and Roland's worry. The best-practice guide is great and I will continue to read it. — li jiaqi, Jan 30 '23 at 04:23