
I would like to perform bootstrapping to obtain 95% CIs for the fixed effects in a binomial GLMM:

m <- glmer(cbind(df$Valid.detections, df$Missed.detections) ~ distance + 
              Habitat + Replicate + transmitter.depth + receiver.depth + 
              wind.speed + wtc + Transmitter + (1 | Unit) + 
              (1 | SUR.ID) + distance:Transmitter + 
              distance:Habitat + distance:transmitter.depth + distance:receiver.depth + 
              distance:wind.speed, data = df, family = binomial(link = "logit"),
              control = glmerControl(calc.derivs = FALSE))

I found that the confint() function should be able to achieve this, so I called it as:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000)

The function had been running for more than 8 hours before I decided to terminate it. Upon termination, the following warning message was returned (repeated 10 times):

Warning messages:
1: In (function (fn, par, lower = rep.int(-Inf, n), upper = rep.int(Inf,  :
  failure to converge in 10000 evaluations

My questions are: 1) Do I need to worry about these warning messages, and if so, how can I deal with them? 2) Because it was still running after 8 hours, I have no idea how long this function takes, so some sort of progress bar while it runs would be nice. I read that confint() can pass arguments to bootMer, so I included the argument .progress = "txt", resulting in:

confint(m, method = "boot", boot.type = "basic", seed = 123, nsim = 1000, .progress = "txt")

but the progress bar doesn't appear. Alternatively, if there are better ways to achieve the same goal, I'm open to suggestions.

Thanks for any help

FlyingDutch
  • what does "doesn't seem to work" in your second-to-last sentence mean? You need to load the `plyr` package -- sorry if that's not clear. Profile confidence intervals would be faster, and might be sufficiently accurate. – Ben Bolker Oct 28 '14 at 16:53
  • You could also use the `parm` parameter to restrict which parameters you're profiling (i.e. skip the random-effects parameters ...) – Ben Bolker Oct 28 '14 at 17:16
  • you could increase the maximum number of iterations (see `?glmerControl`) to get rid of the warnings. You should expect the bootstrapping time to scale approximately linearly with the number of samples (i.e., 1000 bootstrap samples should take on the order of 1000x as long as fitting the model in the first place, although there are some tricks built in to make it a little faster). – Ben Bolker Oct 28 '14 at 17:18
  • oops, I was wrong. You *shouldn't* need to load `plyr` to get the progress bar to work. – Ben Bolker Oct 28 '14 at 17:23
  • how long did your initial fit take? – Ben Bolker Oct 28 '14 at 17:24
  • Well, what doesn't seem to work is getting the progress bar to display, even though I explicitly specify it in the function. My initial fit took 8.5 hours before I terminated it – FlyingDutch Oct 28 '14 at 18:31
  • Alright, I'll try to compute profile confidence intervals instead of bootstrapping and specify the parm argument for only fixed effects. Thank you. – FlyingDutch Oct 28 '14 at 18:34
  • So everything works as expected but the progress bar doesn't appear? – Ben Bolker Oct 28 '14 at 20:04
  • by "initial" I mean your first chunk of code (`m <- glmer(...)`) – Ben Bolker Oct 28 '14 at 21:26
  • Yes, it seems to work but for a very long time. Oh okay, my computer takes 89 secs to run the glmer. – FlyingDutch Oct 29 '14 at 08:51

1 Answer


Here's an example:

library("lme4")
(t1 <- system.time(
    gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial)))
##    user  system elapsed 
##   0.188   0.000   0.186

nranpars <- length(getME(gm1,"theta"))
nfixpars <- length(fixef(gm1))

(t2 <- system.time(c1 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt")))

##    user  system elapsed 
## 221.958   0.164 222.187

Note that this progress bar is only 80 characters long, so it increments only after every 1000/80 ≈ 12 bootstrap iterations. If your original model took an hour to fit, you shouldn't expect to see any progress-bar activity until about 12 hours in ...
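Since confint() just forwards these arguments, you can also call bootMer() directly, which makes the progress-bar options explicit. A minimal sketch using the same cbpp example model (FUN = fixef and nsim = 200 are illustrative choices, not the only ones):

```r
library("lme4")
library("boot")

## fit the example model from above
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## bootMer() resamples and refits; FUN extracts the statistics to
## bootstrap (here the fixed effects); .progress/PBargs control the bar
b1 <- bootMer(gm1, FUN = fixef, nsim = 200,
              .progress = "txt", PBargs = list(style = 3))

## basic bootstrap CI for the first fixed effect (the intercept)
boot.ci(b1, index = 1, type = "basic")
```

bootMer() returns a "boot" object, so the usual boot-package machinery (boot.ci() and friends) applies.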

(t3 <- system.time(c2 <- confint(gm1,
                  parm=(nranpars+1):(nranpars+nfixpars))))

##    user  system elapsed 
##   5.212   0.012   5.236 

1000 bootstrap reps is probably overkill -- if your model fit is slow, you can probably get reasonable CIs from 200 bootstrap reps.
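For a quick sanity check before committing to a long bootstrap run, Wald intervals are nearly instantaneous, although they rely on the quadratic approximation at the MLE and are only returned for the fixed effects. A sketch:

```r
library("lme4")
gm1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
             data = cbpp, family = binomial)

## Wald CIs: fast, based on the estimated standard errors;
## random-effects parameters get NA rows
confint(gm1, method = "Wald")
```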

I tried this with optimizer="nloptwrap" as well, hoping it would speed things up. It did, although there is a drawback (see below).

(t4 <- system.time(
    gm1B <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                 data = cbpp, family = binomial, 
                 control=glmerControl(optimizer="nloptwrap"))))
##   user  system elapsed 
##  0.064   0.008   0.075 

(t5 <- system.time(c3 <- confint(gm1B,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  .progress="txt",PBargs=list(style=3))))
##
##   user  system elapsed 
## 65.264   2.160  67.504

This is much faster, but produced warnings (37 in this case) about convergence. According to all.equal(), the confidence intervals calculated this way differed by only about 2%. (There are still some wrinkles to sort out in the package itself ...)
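As for the "failure to converge in 10000 evaluations" warnings in the question: those come from the optimizer hitting its evaluation limit, which you can raise via glmerControl(). A sketch (maxfun = 2e5 is an arbitrary generous limit, adjust as needed):

```r
library("lme4")

## raise the maximum number of function evaluations for the optimizer;
## applies to the default optimizers used by glmer()
gm1C <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial,
              control = glmerControl(optCtrl = list(maxfun = 2e5)))
```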

Your best bet for speeding this up will be to parallelize -- unfortunately, this way you lose the ability to use a progress bar ...

(t6 <- system.time(c4 <- confint(gm1,method="boot", nsim=1000,
                  parm=(nranpars+1):(nranpars+nfixpars),
                  parallel="multicore", ncpus=4)))

## 
##     user  system elapsed 
##  310.355   0.916 116.917

This takes more user time (it counts the time used on all cores), but the elapsed time is cut in half. (It would be nice to do better with 4 cores, but twice as fast is still good. These are virtual cores on a virtual Linux machine; real (non-virtual) cores might perform better.)

With nloptwrap and multicore combined I can get the time down to 91 seconds (user)/ 36 seconds (elapsed).
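Putting the two together looks like this (a sketch; note that parallel = "multicore" is POSIX-only, so Windows users would need parallel = "snow" instead):

```r
library("lme4")

## faster optimizer ...
gm1B <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial,
              control = glmerControl(optimizer = "nloptwrap"))

nranpars <- length(getME(gm1B, "theta"))
nfixpars <- length(fixef(gm1B))

## ... combined with multicore bootstrapping (no progress bar possible)
system.time(c5 <- confint(gm1B, method = "boot", nsim = 1000,
                          parm = (nranpars + 1):(nranpars + nfixpars),
                          parallel = "multicore", ncpus = 4))
```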

Ben Bolker
  • Thank you Ben, I'm trying your proposed codes with only 200 bootstraps instead of a 1000 (the # of bootstraps was based on a paper I read somewhere). – FlyingDutch Oct 29 '14 at 09:41
  • After googling the warning message returned when fitting the model, I came across http://stackoverflow.com/questions/19478686/increase-iterations-for-new-version-of-lmer and adjusted the number of iterations to 20000 and the warnings were gone. – FlyingDutch Oct 29 '14 at 09:59
  • When I run your or my code (it doesn't matter) to fit the model with nloptr optimiser I get returned the following warning message: Warning messages: 1: In data("nloptr.default.options", package = "nloptr", envir = ee) : data set ‘nloptr.default.options’ not found. Any ideas? – FlyingDutch Oct 29 '14 at 11:03
  • I tried to compute profile CIs for my specific model. It does run fast, but I get many (50+) warning messages like: 2: In zeta(shiftpar, start = opt[seqpar1][-w]) : NAs detected in profiling 3: In nextpar(mat, cc, i, delta, lowcut, upcut) : Last two rows have identical or NA .zeta values: using minstep – FlyingDutch Oct 29 '14 at 11:21