Model Averaging glmer problems

Question

I am very new to this with limited statistical experience, so please bear with me. I am trying to run model average on my data using glmer.

My data has 3 explanatory categorical variables and have successfully run dredge() on them and their interactions to get AICc values. However, when I run model.avg() I get output for some of the models, but no output with others. This is what I have input.

ae <- read.csv(file=file.choose())
options(na.action="na.fail")
global.model<-glmer(
     cbind(numerator,total-numerator)~d+s+t+d:s:t+d:s+d:t+s:t+(1|random), 
     data=ae, family=binomial)   
options(max.print=1000000)
dredge(global.model,beta=c("none"),evaluate=TRUE,rank="AICc") 
ae.model <- glmer(
     cbind(numerator,total-numerator)~d+s+t+d:s:t+d:s+d:t+s:t+(1|random),
    data=ae,family=binomial)
models <- dredge(ae.model)  
summary(model.avg(get.models(models,subset=delta<5)))

An error message comes up:

Error in model.avg.default(get.models(models, subset = delta < 5)) : models are not unique. Duplicates: '2 = 3 = 4' and '10 = 11'

I really don't understand where I am going wrong and why I am getting an output for some interactions and not others.

Thanks in advance for any help given.

summary(ae)
  p                   t           day             hour            scan             random    behaviour  
 ae:182   blood        :42   Min.   :1.000   Min.   :1.000   Min.   : 0   ae_blood_1_1:  7   alert:182  
          egg          :35   1st Qu.:1.000   1st Qu.:1.000   1st Qu.:10   ae_blood_1_2:  7              
          repellentfree:63   Median :2.000   Median :2.000   Median :30   ae_blood_1_3:  7              
          wolf         :42   Mean   :1.654   Mean   :1.962   Mean   :30   ae_blood_2_1:  7              
                             3rd Qu.:2.000   3rd Qu.:3.000   3rd Qu.:50   ae_blood_2_2:  7              
                             Max.   :3.000   Max.   :3.000   Max.   :60   ae_blood_2_3:  7              
                                                                          (Other)     :140              
   numerator           total      proportion        percentage      d                        s     
 Min.   : 0.0000   Min.   :17   Min.   :0.00000   Min.   : 0.000   E :14   1 - very light wind:21  
 1st Qu.: 0.0000   1st Qu.:17   1st Qu.:0.00000   1st Qu.: 0.000   SE:84   2 - light wind     :70  
 Median : 0.0000   Median :17   Median :0.00000   Median : 0.000   SW:35   3 - moderate wind  :77  
 Mean   : 0.5824   Mean   :17   Mean   :0.03426   Mean   : 3.426   W :49   4 - heavy wind     :14  
 3rd Qu.: 0.0000   3rd Qu.:17   3rd Qu.:0.00000   3rd Qu.: 0.000                                   
 Max.   :16.0000   Max.   :17   Max.   :0.94118   Max.   :94.118

It's hard to tell without a reproducible example. I've tried a couple of ways to regenerate a data set that looks like yours and haven't been able to replicate the problem. I'm guessing that it has to do with the way you've explicitly specified the interactions in the model. What happens if you use `...~d*s*t+(1|random)` instead? — Ben Bolker, Apr 14 '18 at 20:36
Apologies for delay in response....didn't think anyone would respond that quickly and thank you for the edits. Would it be easier for you if you could see my raw data file (not sure how to upload though????)? I am trying to look at alert behaviour in deer with 'd' - wind direction, 's' - wind speed, 't' - treatment (odour repellent). — Daisy26, Apr 15 '18 at 14:15
I did and I didn’t get an output. Just same error message I’m afraid. — Daisy26, Apr 15 '18 at 17:14
as a start, can you edit your question to show the results of `summary(ae)`? — Ben Bolker, Apr 15 '18 at 17:23
Oops, added numerator in so I could use proportion rather than percentage. Lots of warning messages saying columns have been dropped? — Daisy26, Apr 15 '18 at 19:56
wasn't suggesting you should add `numerator` to your predictor variables. Just wanted to know what the marginal distribution of your `numerator` variable was. At this point it might be best, if possible, if you post your data somewhere online and edit your article to include a URL - hard to give you any more advice about how to create a reproducible example (although see https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example ) — Ben Bolker, Apr 15 '18 at 20:05
Did you try this with `d*s*t` instead of `d+s+t+d:s+s:t+d:t+d:s:t` as I suggested? — Ben Bolker, Jun 07 '18 at 18:03
Would yourself (or anybody else) have any further suggestions of how I can recifty this problem? Many thanks in advance if anyone is able. — Daisy26, Jun 14 '18 at 12:48
just to clarify: when you used `d*s*t` in your formula, **exactly** what happened when you used `model.avg()` ? — Ben Bolker, Jun 15 '18 at 00:46
I had an error message stating that my models were not unique and there was a a lot of duplicates. I do not understand how they can be duplicates? — Daisy26, Jun 15 '18 at 10:43
I also had a warning message stating that the models had failed to converge for one of my places — Daisy26, Jun 15 '18 at 11:08
Hi Ben, my apologies if I appear to be consistently asking questions to yourself, but you are the only one who has tried to help me. Would you have any further ideas as to what I could try to do next or are you at a dead end? Thank you for all your help so far. It is very much appreciated. — Daisy26, Jun 19 '18 at 16:55
I can appreciate your desperation ... I will have a look at the data you sent today if I get a chance. — Ben Bolker, Jun 19 '18 at 17:55

score 0 · Answer 1 · answered Apr 15 '18 at 19:55

Guessing that the problem is with some of your interaction terms being redundant with each other because of some combination of your experimental design (which combinations of factors are actually represented) and the way you've written your factor. Guessing that you might have better luck expressing your model formula as cbind(numerator,total-numerator)~d*s*t+(1|random), which will make it easier for R to automatically exclude redundant terms.

I'm having trouble reproducing this. (This isn't exactly an answer, but too long for a comment ...) When I sampled factor levels randomly to get the same number of replicates as in your data set, most of the interaction terms ended up being collinear so the model more or less collapsed. I constructed a factorial design (balanced, with 4x4x4x2 = 128 total observations) and added the rest of the necessary variables randomly:

set.seed(101)
ae <- expand.grid(d=c("E","SE","SW","W"),
                 s=c("very_light","light","moderate","heavy"),
                 t=c("blood","egg","rf","wolf"),
                 rep=1:2)
ae <- data.frame(ae,
      random=sample(LETTERS,size=nrow(ae),replace=TRUE),
      total=17,
      numerator=sample(c(0,16),prob=c(0.96,0.04),replace=TRUE,size=nrow(ae)))

(Note that the marginal distribution of your response variable is very skewed -- the third quartile is zero, max is 16/17, mean is only about 0.5, which implies you have mostly zeroes with a few large values. A binomial model might not work very well.)

This slightly stripped-down version of your code produces lots of warnings (in part because there's no actual signal in the response variable), but no errors (I used subset=TRUE to model-average all of the models because there was only one model with delta-AIC<5 in this example set):

library(lme4)
library(MuMIn)
options(na.action="na.fail")
ae.model <- glmer(
  cbind(numerator,total-numerator)~d+s+t+d:s:t+d:s+d:t+s:t+(1|random),
  data=ae,family=binomial)
models <- dredge(ae.model,trace=TRUE)  
summary(model.avg(get.models(models,subset=TRUE)))

Thank you so much for your help...I shall have a thorough look at this with my data tomorrow. Very grateful. Thank you! — Daisy26, Apr 15 '18 at 20:11

Model Averaging glmer problems

1 Answers1