I'm having trouble understanding why my coef() call is returning the same intercept and slope for every participant in my data.

For context, I am comparing two models (built in lmer) using the anova function.

Model 1 is as follows: `model1 <- lmer(Pen ~ wave + (1 | id), data = no_missing, REML = FALSE)`

Model 2 adds a variable of interest, QEL: `model2 <- lmer(Pen ~ wave + QEL + (1 | id), data = no_missing, REML = FALSE)`

When I run anova(model1, model2) I get the results I expect. My issue arises when I look at the coefficients with coef().

I'm wondering why the intercept and slope (below) are the same for everyone? Have I not put my models together correctly to get an intercept for each person (i.e., are they based on a fixed effect rather than random effect)?

model1 and model2 coef() output:

> coef(model1)
$id
   (Intercept)     wave
1     74.66694 17.31497
7     74.66694 17.31497
10    74.66694 17.31497
11    74.66694 17.31497
13    74.66694 17.31497
14    74.66694 17.31497
15    74.66694 17.31497
16    74.66694 17.31497
18    74.66694 17.31497
28    74.66694 17.31497
29    74.66694 17.31497
30    74.66694 17.31497
31    74.66694 17.31497
32    74.66694 17.31497
33    74.66694 17.31497
34    74.66694 17.31497
35    74.66694 17.31497
36    74.66694 17.31497
37    74.66694 17.31497
38    74.66694 17.31497
39    74.66694 17.31497
40    74.66694 17.31497

attr(,"class")
[1] "coef.mer"


> coef(model2)
$id
   (Intercept)     wave      QEL
1      36.8735 16.18188 0.436023
7      36.8735 16.18188 0.436023
10     36.8735 16.18188 0.436023
11     36.8735 16.18188 0.436023
13     36.8735 16.18188 0.436023
14     36.8735 16.18188 0.436023
15     36.8735 16.18188 0.436023
16     36.8735 16.18188 0.436023
18     36.8735 16.18188 0.436023
28     36.8735 16.18188 0.436023
29     36.8735 16.18188 0.436023
30     36.8735 16.18188 0.436023
31     36.8735 16.18188 0.436023
32     36.8735 16.18188 0.436023
33     36.8735 16.18188 0.436023
34     36.8735 16.18188 0.436023
35     36.8735 16.18188 0.436023
36     36.8735 16.18188 0.436023
37     36.8735 16.18188 0.436023
38     36.8735 16.18188 0.436023
39     36.8735 16.18188 0.436023
40     36.8735 16.18188 0.436023

attr(,"class")
[1] "coef.mer"
jbrimm2004
  • @GregorThomas yes, but `coef()` can basically be thought of as `fixef()` + `ranef()` so you should see different values for each `id` if the model is estimating random intercepts for each `id`. The `wave` and `QEL` coefficients should be the same for each `id`, but the intercepts should be different. Looking at this, it makes me think maybe the models are giving a singular fit where the random intercept coefficient is zeroed out. The data aren't permitting `lmer()` to estimate separate intercepts for each `id`. Either that, or there is an issue with the data formatting. – qdread Apr 14 '22 at 16:24
  • @jbrimm2004 do you get a singular fit warning when fitting either of these models? – qdread Apr 14 '22 at 16:24
  • @qdread yes, yes I do! Is this the cause of the issue and is it due to a very small random effect? Also, thank you for the reply – jbrimm2004 Apr 14 '22 at 16:35
  • Yes, the random effect can't be supported by the data so (to put it in an oversimplified way) `lmer` "gives up" and calls all the random intercepts zero. This often occurs when you only have a small number of data points per subject. This question on stats stackexchange might help you: https://stats.stackexchange.com/questions/378939/dealing-with-singular-fit-in-mixed-models . There are a few solutions proposed there. My favorite solution is to refit the model in a Bayesian framework which can deal with the small sample size issue much better. – qdread Apr 14 '22 at 18:24
  • Also congrats on your first stackoverflow question! In the future I would recommend checking out the tutorials on making a "reproducible example"... if you had provided code in your question that I could run on my own machine to reproduce your error (or in this case warning), it would have been a lot easier and faster to get to the solution of your problem. See https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – qdread Apr 14 '22 at 18:27
  • @qdread Thank you so much for the help and links. The study does have a small sample size and, as a result, a small number of data points per person as well (mainly because of COVID....). I'll check out my Bayesian options with the link you provided. Also, thank you for the link about "reproducible code". I hadn't seen that but I see how it can make everyone's life easier! – jbrimm2004 Apr 14 '22 at 19:15
  • I decided to write up the comments as an answer. Please consider upvoting and accepting if it's helpful! – qdread Apr 15 '22 at 00:49

1 Answer

A situation like this is often the result of an lmer() call that returns a singular fit. The random effect can't be supported by the data so (to put it in an oversimplified way) lmer "gives up" and estimates all the random intercepts as zero.
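One way to confirm whether a fit is singular is to ask lme4 directly. The sketch below is only an illustration: the `toy` data frame is made up (it reuses the column names from the question but has no real between-subject variation), and whether this particular fit comes out singular depends on the simulated noise.

```r
library(lme4)

# Hypothetical toy data: 8 subjects, 3 waves each, simulated with NO true
# between-subject variation, so the id variance tends to be estimated near zero.
set.seed(1)
toy <- data.frame(
  id   = factor(rep(1:8, each = 3)),
  wave = rep(1:3, times = 8)
)
toy$Pen <- 75 + 17 * toy$wave + rnorm(nrow(toy), sd = 5)

fit <- lmer(Pen ~ wave + (1 | id), data = toy, REML = FALSE)

# TRUE when a variance component is estimated at (or very close to) zero
singular <- isSingular(fit)
print(singular)

# Estimated standard deviations of the random intercept and the residual
print(VarCorr(fit))
```

On the models from the question, `isSingular(model1)` and `VarCorr(model1)` would be the equivalent checks; a standard deviation of (almost) zero for `id` matches the identical rows in the `coef()` output.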

In the case of model1 and model2, the model has only a random intercept for each id and no random slopes. So if the random intercepts had non-zero estimates, coef(model1) would show a different intercept coefficient for each id, but the wave slope coefficient would be the same in each row.
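As a rough illustration of how `coef()` is assembled for a random-intercept model, the sketch below rebuilds the per-subject intercepts from `fixef()` and `ranef()`. It uses the `sleepstudy` data that ships with lme4 purely as stand-in data, not the data from the question.

```r
library(lme4)

# Stand-in data: sleepstudy is bundled with lme4
fit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

fe <- fixef(fit)           # population-level intercept and slope
re <- ranef(fit)$Subject   # per-subject deviations from the intercept

# Per-subject intercept = fixed intercept + that subject's random deviation
manual_intercepts <- unname(fe["(Intercept)"]) + re[["(Intercept)"]]

# Compare the hand-built intercepts with what coef() reports
same_intercepts <- all.equal(manual_intercepts,
                             coef(fit)$Subject[["(Intercept)"]])
print(same_intercepts)

# No random slope in the model, so every subject shares the fixed Days slope
n_distinct_slopes <- length(unique(coef(fit)$Subject$Days))
print(n_distinct_slopes)
```

When the fit is singular, the `ranef()` deviations are all zero, so every row of `coef()` collapses to the fixed effects, which is the pattern shown in the question.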

This often occurs when you only have a small number of data points per subject. This question on stats stackexchange might help: https://stats.stackexchange.com/questions/378939/dealing-with-singular-fit-in-mixed-models . There are a few solutions proposed there. My favorite solution is to refit the model in a Bayesian framework, which can deal with the small sample size issue much better. See also "How to cope with a singular fit in a linear mixed model (lme4)?".

qdread