How to run a generalized linear mixed model (GLMM) with multiple random factors?

Question

I would like to run a GLMM with multiple random factors using the function glmer in package lme4.

I have a dataset on marine debris like this:

count density: numeric
year: categorical, two levels
round: categorical (each year has its own six rounds, so round is nested in year)
monitoring site: categorical (data are measured at each monitoring site six times a year, so round is crossed with monitoring site)
waters: categorical (each waters has several different sites, so monitoring site is nested in waters)
material: categorical

I would like to know whether the count densities of marine debris differ significantly between/among years, rounds, waters and materials. So I put in this:

glmm <- glmer(count density~material*(1|year/round)*(1|waters/monitoring sites),
    family=Poisson)

Could you please let me know if my formula is right?

Also, I get nothing back from the model. When I typed in:

glmm

It said:

Error: object 'glmm' not found

So what's the right way to use glmer?

You must have gotten several errors. What were they? Some more questions/comments: (1) variables with spaces in them are problematic (at the very least they have to be protected with back-ticks); (2) it's "poisson" (lowercase), not "Poisson" (3) terms should be connected with `+` not `*` — Ben Bolker, Apr 21 '19 at 01:46
Also note that if each `monitoring site` has its own level (its own name), then you don't need to nest within `waters`, that is redundant. — Dylan_Gomes, Apr 22 '19 at 23:35

Ben Bolker · Answer 1 · 2019-04-22T05:01:18.013

At the very least (if your variable names really have spaces in them, which is generally a bad idea, see e.g. this question) you should try:

glmm <- glmer(`count density` ~ material+(1|year/round)+
              (1|waters/`monitoring sites`), 
              family=poisson)
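
As a minimal sketch of how this call might look end to end, the following fits the same formula to simulated data. Everything about the data is an assumption made for illustration: the data frame `d`, the renamed columns `count_density` and `monitoring_site` (syntactic names, so no back-ticks are needed), the design sizes, and the simulated effect sizes. Note the `data =` argument, which tells `glmer` where to find the variables.

```r
library(lme4)

set.seed(42)

# Hypothetical balanced design: 2 years x 6 rounds x 12 sites x 3 materials.
d <- expand.grid(year = c("y1", "y2"),
                 round = 1:6,
                 monitoring_site = 1:12,
                 material = c("glass", "metal", "plastic"),
                 stringsAsFactors = TRUE)

# Four sites per water body, so each site belongs to exactly one 'waters' level.
d$waters <- factor(paste0("w", (d$monitoring_site - 1) %/% 4 + 1))

# Simulated integer counts with a random site effect on the log scale
# (made-up values, not taken from the question).
site_eff <- rnorm(12, sd = 0.5)
d$count_density <- rpois(nrow(d), exp(1 + site_eff[d$monitoring_site]))

# Grouping variables should be factors.
d$round <- factor(d$round)
d$monitoring_site <- factor(d$monitoring_site)

fit <- glmer(count_density ~ material + (1 | year/round) +
               (1 | waters/monitoring_site),
             data = d, family = poisson)

print(fit)
```

With only two years, the fit may report a singular fit or a near-zero variance for `year`. Because the site labels here are unique across waters, `(1 | waters/monitoring_site)` specifies the same model as `(1 | waters) + (1 | monitoring_site)`.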

Also note that year won't work well as a random effect because it only has two levels (it's hard to estimate a variance from only two observations: see e.g. these simulations), so maybe

glmm <- glmer(`count density` ~ material+year+(1|year:round)+
              (1|waters/`monitoring sites`), 
              family=poisson)

would be better.
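
To get a feel for why two levels are too few to estimate an among-year variance, here is a rough base-R sketch. It is not the linked simulation. It assumes a true among-group standard deviation of 1 (`true_sd`, a made-up value) and treats the group effects as if they were observed directly, which is a best case compared with a real mixed model.

```r
set.seed(1)

# Assumed true among-group standard deviation (hypothetical value).
true_sd <- 1

# For a given number of groups, draw the group effects many times and
# record the sample standard deviation of each draw.
sd_estimates <- function(n_groups, n_sims = 5000) {
  replicate(n_sims, sd(rnorm(n_groups, mean = 0, sd = true_sd)))
}

est_2  <- sd_estimates(2)   # two levels, like 'year' in the question
est_10 <- sd_estimates(10)  # ten levels, for comparison

# Compare the spread of the estimates for the two cases.
quantile(est_2,  c(0.025, 0.5, 0.975))
quantile(est_10, c(0.025, 0.5, 0.975))
```

With two groups the estimates are spread over a much wider range than with ten, which is the practical problem with treating a two-level factor as random.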

edited Apr 22 '19 at 05:01

answered Apr 21 '19 at 01:49

Ben Bolker


"year won't work well as a random effect because it only has two levels" -- because? – thc Apr 21 '19 at 04:34
Maybe because you'd be estimating the variance of `year` based on only 2 observations. It also makes little sense from a theoretical point of view: Years aren't drawn randomly from some larger population of years. – Frans Rodenburg Apr 22 '19 at 02:43
why the downvote? happy to try to address concerns if they are raised explicitly ... – Ben Bolker Apr 22 '19 at 05:01
@FransRodenburg two levels does not mean two observations. I also disagree that it doesn't make sense (necessarily). You could contrive a situation when it would make sense. – thc Apr 22 '19 at 06:58
but two levels is the relevant value for the question of estimating the among-year variance. @thc, I do agree that using year as a random effect could make sense. But it is very unlikely to be practical with two years (see the simulation linked above, and the GLMM FAQ, and many discussions on r-sig-mixed-models). – Ben Bolker Apr 22 '19 at 13:17
I don't see why it is impractical. Let's say we have a strong belief that the `year` variable does contribute via some physical random process -- how does nesting it with `round` improve our estimation of the random process due to `year`? Regarding the simulation link: the s.d. is biased, but the variance is not biased. – thc Apr 22 '19 at 23:56
