
I'm running a glmmTMB model with various truncated count distributions (truncated_poisson, truncated_compois, truncated_nbinom1, truncated_nbinom2). When I predict from the model, the values seem to be lower than expected, as if the prediction is not accounting for the truncation. Where am I going wrong? A toy example is provided, showing that predicted values are lower than observed means.

Any advice would be appreciated. Extra points if the advice can extend to the other truncated count distributions (see above) and if it shows how to correctly get the 95% confidence band around the estimated values in these cases.

library(dplyr)
library(extraDistr)
library(glmmTMB)

set.seed(1)
df <- data.frame(Group = rep(c("a", "b"), each = 20), N = rtpois(40, 1, a = 0), ran = "a") %>%
        mutate(N = ifelse(N == 0, 1, N))
m <- glmmTMB(N ~ Group + (1|ran), data = df, family = "truncated_poisson")

df %>% group_by(Group) %>% summarize(mean(N))
predict(m, newdata = data.frame(Group = c("a", "b"), ran = NA), type = "response")
user2602640

1 Answer


I think the main issue is that you're using an older version of glmmTMB (< 1.1.5), where a bug in predictions for truncated distributions was fixed (see e.g. https://github.com/glmmTMB/glmmTMB/issues/860).
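A quick way to check this (assuming glmmTMB is installed; the warning message here is just illustrative):

```r
## the prediction fix for truncated families landed in glmmTMB 1.1.5
if (packageVersion("glmmTMB") < "1.1.5") {
    warning("predictions for truncated families are biased in glmmTMB < 1.1.5; please update")
}
```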

sample data

Streamlined slightly (we don't need a random effect for this example), with a truncated nbinom2 response added.

library(dplyr)
library(extraDistr)
library(glmmTMB)

set.seed(1)
df <- data.frame(Group = rep(c("a", "b"), each = 20),
                 Np = rtpois(40, 1, a = 0))

## clunky zero-truncated nbinom generator: redraw zeros until none remain
tnb <- rep(0, 40)
z <- (tnb == 0)
while (any(z)) {
    tnb[z] <- rnbinom(sum(z), mu = 1, size = 1)
    z <- (tnb == 0)
}
df$Nnb <- tnb
## summarize
df %>% group_by(Group) %>% summarize(across(starts_with("N"), mean))
##   Group    Np   Nnb
## 1 a      1.75  1.8 
## 2 b      1.45  2.35
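As a sanity check (not in the original answer): the theoretical mean of a zero-truncated Poisson with rate lambda = 1 is lambda/(1 - exp(-lambda)) ≈ 1.58, which is consistent with the sample means above:

```r
## theoretical zero-truncated Poisson mean: E[N | N > 0] = lambda/(1 - exp(-lambda))
lambda <- 1
lambda / (1 - exp(-lambda))  ## approx 1.582
```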

fit models

m1 <- glmmTMB(Np ~ Group, data = df, family = "truncated_poisson")
m2 <- update(m1, Nnb ~ ., family = truncated_nbinom2)

Predicting with se.fit = TRUE will give you standard errors for the predictions, from which you can compute confidence intervals (assuming Normality/Wald intervals/blah blah blah ...) ...

pfun <- function(m, level = 0.95) {
    pp <- predict(m, newdata = data.frame(Group = c("a", "b")),
                  type = "response",
                  se.fit = TRUE)
    list(est = unname(pp$fit),
         lwr = unname(pp$fit + qnorm((1 - level)/2) * pp$se.fit),
         upr = unname(pp$fit + qnorm((1 + level)/2) * pp$se.fit))
}
pfun(m1)
pfun(m2)
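Given the Normality caveat above, a somewhat more defensible interval builds the Wald CI on the link (log) scale and back-transforms. For truncated distributions the back-transformation has to include the truncation adjustment. The sketch below assumes a truncated-Poisson model; `pfun_link` and `tp_mean` are hypothetical helper names, and for truncated_nbinom2 the zero probability would be `(size/(size + mu))^size` rather than `exp(-lambda)`:

```r
## mean of a zero-truncated Poisson given the untruncated rate lambda
tp_mean <- function(lambda) lambda / (1 - exp(-lambda))

pfun_link <- function(m, level = 0.95) {
    ## predict on the link (log-lambda) scale, where Wald intervals behave better
    pp <- predict(m, newdata = data.frame(Group = c("a", "b")),
                  type = "link", se.fit = TRUE)
    q <- qnorm((1 + level)/2)
    ## back-transform endpoints, then apply the truncation adjustment
    list(est = tp_mean(exp(pp$fit)),
         lwr = tp_mean(exp(pp$fit - q * pp$se.fit)),
         upr = tp_mean(exp(pp$fit + q * pp$se.fit)))
}
pfun_link(m1)
```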
Ben Bolker
  • Well, that was an easier fix than I expected. Thank you! Two follow-ups: 1) I thought that using the SE from response predictions wasn't proper for a glm, and that I'm supposed to predict on the link scale and then back-transform? 2) A while back, I asked a question about the possibility of using glmmTMB without a random effect. Care to post an answer, seeing how it's apparently possible? https://stats.stackexchange.com/questions/511029/can-glmmtmb-be-used-without-a-random-effect?noredirect=1#comment1105924_511029 – user2602640 Jan 05 '23 at 12:52
    (1) oops, yes, you're right - it would be better to predict on the link scale and back-transform (the solution above is approximately OK for *large* samples/low uncertainty ...); (2) will post. – Ben Bolker Jan 05 '23 at 13:02
  • How would I go about the back-transformation in the case of truncated distributions? This was my original intent in my question, apologies for not specifying that. Obviously I can't just exponentiate the link, since that gives me the non-truncated Poisson value... – user2602640 Jan 06 '23 at 14:42