I would like to run a GLMM with multiple random factors using the function glmer in package lme4.

I have a dataset on marine debris like this:

  • count density: numeric
  • year: categorical, two levels
  • round: categorical (each year has its own six rounds, so round is nested in year)
  • monitoring site: categorical (data is measured on each monitoring site 6 times a year, so round is crossed with monitoring site)
  • waters: categorical (each waters has several different sites, so monitoring site is nested in waters)
  • material: categorical

I would like to know whether the count density of marine debris differs significantly between/among years, rounds, waters and materials. So I entered this:

glmm <- glmer(count density~material*(1|year/round)*(1|waters/monitoring sites),
    family=Poisson)

Could you please let me know if my formula is right?

I also can't get anything from the model. When I typed:

glmm

It said:

Error: object 'glmm' not found

So what's the right way to use glmer?

Ben Bolker
A. Caikov
  • You must have gotten several errors. What were they? Some more questions/comments: (1) variables with spaces in them are problematic (at the very least they have to be protected with back-ticks); (2) it's "poisson" (lowercase), not "Poisson"; (3) terms should be connected with `+`, not `*`. – Ben Bolker Apr 21 '19 at 01:46
  • Also note that if each `monitoring site` has its own level (its own name), then you don't need to nest within `waters`, that is redundant. – Dylan_Gomes Apr 22 '19 at 23:35

1 Answer

At the very least (if your variable names really have spaces in them, which is generally a bad idea, see e.g. this question) you should try:

glmm <- glmer(`count density` ~ material+(1|year/round)+
              (1|waters/`monitoring sites`), 
              family=poisson)

Also note that year won't work well as a random effect because it only has two levels (it's hard to estimate a variance from only two observations: see e.g. these simulations), so maybe

glmm <- glmer(`count density` ~ material+year+(1|year:round)+
              (1|waters/`monitoring sites`),
              family=poisson)

would be better.
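
For a version you can run end to end, here is a minimal sketch on simulated data. Everything about the data frame is an assumption made for illustration: the name `dat`, the space-free column names (`count`, `site`), the number of waters and sites, and the simulated effect sizes are all hypothetical stand-ins for your real data. It also passes the data frame explicitly through `data =`, which the calls above leave out.

```r
library(lme4)

set.seed(1)

# Hypothetical stand-in for the real data: 2 years x 6 rounds x 3 waters
# x 4 sites per water x 3 materials, one count per combination.
dat <- expand.grid(
  year     = factor(c("Y1", "Y2")),
  round    = factor(1:6),
  site     = factor(1:4),
  waters   = factor(c("A", "B", "C")),
  material = factor(c("plastic", "glass", "metal"))
)

# Give every site its own unique label (e.g. "A.1", "B.1", ...).
dat$site <- interaction(dat$waters, dat$site, drop = TRUE)

# Simulated integer counts with a random site effect on the log scale.
# The Poisson family expects integer counts as the response.
site_eff  <- rnorm(nlevels(dat$site), sd = 0.5)
dat$count <- rpois(nrow(dat), lambda = exp(1 + site_eff[as.integer(dat$site)]))

# Same structure as the second model above, with space-free names.
fit <- glmer(count ~ material + year + (1 | year:round) + (1 | waters/site),
             data = dat, family = poisson)

summary(fit)
```

Because only a site effect is simulated here, the other variance components may be estimated at or near zero and lme4 may report a singular fit; that is a property of the fake data, not of the formula. Once the call succeeds, the assigned object exists, so typing its name prints the model instead of giving "object not found".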

Ben Bolker
  • "year won't work well as a random effect because it only has two levels" -- because? – thc Apr 21 '19 at 04:34
  • Maybe because you'd be estimating the variance of `year` based on only 2 observations. It also makes little sense from a theoretical point of view: Years aren't drawn randomly from some larger population of years. – Frans Rodenburg Apr 22 '19 at 02:43
  • why the downvote? happy to try to address concerns if they are raised explicitly ... – Ben Bolker Apr 22 '19 at 05:01
  • @FransRodenburg two levels does not mean two observations. I also disagree that it doesn't make sense (necessarily). You could contrive a situation when it would make sense. – thc Apr 22 '19 at 06:58
  • But two levels is the relevant value for the question of estimating the among-year variance. @thc, I do agree that using year as a random effect could make sense. But it is very unlikely to be practical with two years (see the simulation linked above, and the GLMM FAQ, and many discussions on r-sig-mixed-models). – Ben Bolker Apr 22 '19 at 13:17
  • I don't see why it is impractical. Let's say we have a strong belief that the `year` variable does contribute via some physical random process -- how does nesting it with `round` improve our estimation of the random process due to `year`? Regarding the simulation link: the s.d. is biased, but the variance is not biased. – thc Apr 22 '19 at 23:56
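
The disagreement in the comments above can be illustrated with a small base-R simulation. This is only a sketch and not the simulation linked in the answer: it assumes the year effects are observed directly and normally distributed, and it uses the plain sample variance rather than a mixed-model estimate. The names `n_sim`, `true_sd` and `var_from_k_years` are made up for this example.

```r
set.seed(42)

n_sim   <- 10000   # number of simulated studies
true_sd <- 1       # true among-year standard deviation (hypothetical value)

# Sample variance of the year effects when only k years are observed.
var_from_k_years <- function(k) {
  replicate(n_sim, var(rnorm(k, mean = 0, sd = true_sd)))
}

v2  <- var_from_k_years(2)
v10 <- var_from_k_years(10)

# Averaged over many studies, both estimators are close to the true variance ...
c(mean_2 = mean(v2), mean_10 = mean(v10))

# ... but with two years the typical (median) estimate is much too small
# and the spread across studies is far wider.
c(median_2 = median(v2), median_10 = median(v10))
quantile(v2,  c(0.05, 0.95))
quantile(v10, c(0.05, 0.95))
```

Under these assumptions both sides have a point: the variance estimate is unbiased on average, yet any single two-year data set will usually give an estimate well below the truth, and sometimes one far above it.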