glm.nb throws an unusual error on certain inputs. While there are a variety of values that cause this error, changing the input even very slightly can prevent the error.

A reproducible example:

library(MASS)  # glm.nb comes from MASS
set.seed(11)
pop <- rnbinom(n=1000,size=1,mu=0.05)
glm.nb(pop~1,maxit=1000)

Running this code throws the error:

Error in while ((it <- it + 1) < limit && abs(del) > eps) { : 
  missing value where TRUE/FALSE needed

At first I assumed that this had something to do with the algorithm not converging. However, a tiny perturbation of the input is enough to prevent the error. For example:

pop[1000] <- pop[1000] + 1
glm.nb(pop~1,maxit=1000)

I've found that it throws this error on 19.4% of the seeds between 1 and 500:

fit.with.seed <- function(s) {
    set.seed(s)
    pop <- rnbinom(n=1000, size=1, mu=0.05)
    glm.nb(pop~1, maxit=1000)
}

# TRUE when glm.nb throws an error for that seed
errors <- sapply(1:500, function(s) {
    is.null(tryCatch(fit.with.seed(s), error=function(e) NULL))
})

mean(errors)

I've found only one mention of this error anywhere, on a thread with no responses.

What could be causing this error, and how can it be fixed (other than randomly permuting the inputs every time glm.nb throws an error)?

ETA: Setting control=glm.control(maxit=200,trace = 3) shows that the theta.ml algorithm breaks down: theta grows very large, then becomes -Inf, then NaN:

theta.ml: iter67 theta =5.77203e+15
theta.ml: iter68 theta =5.28327e+15
theta.ml: iter69 theta =1.41103e+16
theta.ml: iter70 theta =-Inf
theta.ml: iter71 theta =NaN
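For reference, a minimal sketch that may reproduce the divergence directly, by calling MASS's theta.ml (the helper glm.nb uses to estimate theta) with the intercept-only fitted mean; using mean(pop) to stand in for the fitted mu is an assumption for illustration:

library(MASS)
set.seed(11)
pop <- rnbinom(n=1000,size=1,mu=0.05)
# for an intercept-only model the fitted mean is the sample mean;
# trace=TRUE prints theta at each iteration, as in the output above
theta.ml(pop, mu=rep(mean(pop), length(pop)), limit=1000, trace=TRUE)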
David Robinson
  • This won't solve your problem, but it may speed up the search for someone more knowledgeable: setting `control=glm.control(maxit=200,trace = 3)` sheds some important light on exactly what's going wrong in `theta.ml`. – joran Jul 31 '12 at 22:56
  • Can you post your `sessionInfo()`? I can't replicate the error on 2.15.1 on Windows in either 32 or 64-bit. – nograpes Jul 31 '12 at 23:10
  • I'm also using 2.15.1, but on a Mac. See above – David Robinson Jul 31 '12 at 23:13
  • FWIW I also get the error and am also in 2.15.1 on a Mac. – joran Jul 31 '12 at 23:15
  • I can replicate with this: `set.seed(11);pop <- rnbinom(n=1000,size=1,mu=0.05);glm.nb(pop~1,maxit=1000)`. Do your Macs get the same error with that code? – nograpes Jul 31 '12 at 23:18
  • Yes, I do. I will put that near top of question. – David Robinson Jul 31 '12 at 23:20
  • for what it's worth you may be able to get this to work with `bbmle::mle2(Y~dnbinom(mu=exp(logmu),size=exp(logk)), parameters=list(logmu~ X1 + X2 + offset(X3)), start=..., ...)` – Ben Bolker Jul 31 '12 at 23:35
  • Thanks, @BenBolker: I'm trying that now. I've edited the question to work with the reproducible example nograpes listed (since it's reproducible between platforms, and also more compact). – David Robinson Aug 01 '12 at 01:06
  • I'm also having this trouble, and it really worries me that this isn't completely solved yet. I'm using MASS::glm.nb. I've run my analysis on a Mac and Windows now, with slightly differing results. It's very unsettling. – Laserhedvig Apr 26 '20 at 12:28

2 Answers

It's a bit crude, but in the past I have been able to work around problems with glm.nb by resorting to straight maximum likelihood estimation (i.e. no clever iterative estimation algorithms as used in glm.nb).

Some poking around/profiling indicates that the MLE for the theta parameter is effectively infinite. I decided to fit it on the inverse scale, so that I could put a boundary at 0 (a fancier version would set up a log-likelihood function that reduces to the Poisson at the boundary 1/theta = 0, but that would undo the point of trying to come up with a quick, canned solution).

With two of the bad examples given above, this works reasonably well, although it does warn that the parameter fit is on the boundary ...

library(bbmle)
# d1 (with response Y and covariates X1, X2, X3) is the data from the
# original, since-edited version of the question
m1 <- mle2(Y~dnbinom(mu=exp(logmu),size=1/invk),
           data=d1,
           parameters=list(logmu~X1+X2+offset(X3)),
           start=list(logmu=0,invk=1),
           method="L-BFGS-B",
           # bound invk = 1/theta just above zero
           lower=c(rep(-Inf,12),1e-8))

The second example is actually more interesting because it demonstrates numerically that the MLE for theta is essentially infinite even though we have a good-sized data set that is exactly generated from negative binomial deviates (or else I'm confused about something ...)

# the reproducible example from the question (this glm.nb call fails):
set.seed(11); pop <- rnbinom(n=1000,size=1,mu=0.05); glm.nb(pop~1,maxit=1000)
m2 <- mle2(pop~dnbinom(mu=exp(logmu),size=1/invk),
           data=data.frame(pop),
           start=list(logmu=0,invk=1),
           method="L-BFGS-B",
           lower=c(-Inf,1e-8))
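A quick check of the result (a sketch using standard bbmle accessors; the last line just inverts the invk parameterization above to put the estimate back on the theta scale):

summary(m2)
# invk sits essentially at its lower bound, so theta = 1/invk is effectively infinite
1/coef(m2)["invk"]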
Ben Bolker
  • +1. One thing that worries me is that it gives an `ABNORMAL_TERMINATION_IN_LNSRCH` message. I'm currently investigating how this method compares to `glm.nb` in the cases where `glm.nb` does work. I've also had a surprising amount of success by very slightly altering the inputs in the cases where `glm.nb` breaks (though it disturbs me that I don't know why that works). – David Robinson Aug 01 '12 at 02:35
  • you could also try this with other bounded optimizers, e.g. `library(optimx); ... ; mle2(...,optimizer="optimx",method="bobyqa")` – Ben Bolker Dec 27 '12 at 16:45

Edit: The code and answer have been simplified to one sample, as in the question.

Yes, theta can approach Inf in small samples with sparse data (many zeroes, a small mean, and large skew). I have found that fitting glm.nb fails when the data are all zeroes, returning:

Error in while ((it <- it + 1) < limit && abs(del) > eps) { : missing value where TRUE/FALSE needed
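For example, an all-zero sample should reproduce the error directly (a minimal sketch, assuming MASS is loaded):

# ten zeroes: theta.ml cannot make progress and glm.nb fails as above
glm.nb(rep(0, 10) ~ 1)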

The following code simulates small samples with a small mean and theta. To prevent the loop from crashing, glm.nb is not fitted when the data are all zeroes.

library(MASS)  # provides glm.nb
en1 <- 10
mu1 <- 0.5
size1 <- 0.5
temp <- matrix(nrow=10000, ncol=2)
# theta == Inf is rare so use a large number of reps
for (r in 1:10000){
  dat1 <- rnbinom(n=en1, size=size1, mu=mu1)
  # record the sample mean; skip glm.nb when the sample is all zeroes
  temp[r, 1:2] <- c(mean(dat1), ifelse(max(dat1)!=0, glm.nb(dat1~1)$theta, NA))
}
temp <- as.data.frame(temp)
names(temp) <- c("mean1","theta1")
temp[which(is.na(temp$theta1)),]
# note that it's rare to get all zeroes in the sample
sum(is.na(temp$theta1))/dim(temp)[1]
# a log scale helps see what's happening
with(temp, plot(mean1, log10(theta1)))
# estimated thetas should equal size1 = 0.5
abline(h=log10(0.5), col="red")
text(2.5, 5, "n = 10", col="red", cex=2, adj=1)
text(1, 4, "extreme thetas", col="red", cex=2)

See that estimated thetas can be extremely large when the sample size is small (first plot below). Lesson learnt: don't expect high quality results from glm.nb for small samples and sparse data; get larger samples (e.g. the second plot below).

[Plot 1: log10(theta1) vs mean1 for n = 10, showing extreme thetas]
[Plot 2: the same simulation with a larger sample, n = 100]
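To reproduce the second plot, the simulation above can be rerun with a larger sample (a sketch; only the sample size changes):

en1 <- 100  # then rerun the simulation loop and plotting code above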

stweb
  • this is nice. Maybe change the labels on your graph so they read `n = 10`, `n = 100` rather than `n1 = n2 = 10`, etc.? – Ben Bolker Jul 17 '22 at 23:06
  • Done. Also note that n = 100 is still not perfect. I would be cautious about choosing between glm.nb and glm(family=poisson) based only on the estimated theta. There are better methods for estimating theta. – stweb Jul 19 '22 at 02:49