0

I am working with a multivariate model with a Gamma distribution and I would like to make use of the lme4 syntaxis deployed in glmmTMB, however, I have noticed something strange with my model. Apparently, the model easily converges when I use stats::glm, but seems to give an error in the glmmTMB framework. Here is a reproducible example:

d <- data.frame(gamlss.data::plasma) # Sample dataset

m4.1 <- glm(calories ~ fat*fiber, family = Gamma(link = "log"), data = d)    # Dos parámetros con interacción
m4.2 <- glmmTMB(calories ~ fat*fiber, family = Gamma(link = "log"), data = d)    # Dos parámetros con interacción

>Warning message:
In fitTMB(TMBStruc) :
  Model convergence problem; non-positive-definite Hessian matrix. See vignette('troubleshooting')

I guess, the solution might lie on the control parameters, but after looking on the troubleshooting vignette, I am not sure on where to start.

Santi
  • 368
  • 2
  • 15

1 Answers1

0

One solution can be to scale variables (as long as they are numeric).

d <- data.frame(gamlss.data::plasma) # Sample dataset

m4.1 <- glm(calories ~ fat*fiber, family = Gamma(link = "log"), data = d)    
m4.2 <- glmmTMB(calories ~ scale(fat)*scale(fiber), family = Gamma(link = "log"), data = d) 

Here, the second model converges fine, whereas it didn't before. However, note the difference in parameter estimates between the two models:

> summary(m4.1)

Call:
glm(formula = calories ~ fat * fiber, family = Gamma(link = "log"), 
    data = d)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.42031  -0.07605  -0.00425   0.07011   0.60073  

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  6.120e+00  5.115e-02 119.654   <2e-16 ***
fat          1.412e-02  6.693e-04  21.104   <2e-16 ***
fiber        5.108e-02  3.704e-03  13.789   <2e-16 ***
fat:fiber   -4.092e-04  4.476e-05  -9.142   <2e-16 ***

(Dispersion parameter for Gamma family taken to be 0.0177092)

    Null deviance: 40.6486  on 314  degrees of freedom
Residual deviance:  5.4494  on 311  degrees of freedom
AIC: 4307.2

Number of Fisher Scoring iterations: 4
______________________________________________________________
> summary(m4.2)
Family: Gamma  ( log )
Formula:          calories ~ scale(fat) * scale(fiber)
Data: d

     AIC      BIC   logLik deviance df.resid 
  4307.2   4326.0  -2148.6   4297.2      310 


Dispersion estimate for Gamma family (sigma^2): 0.0173 

Conditional model:
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)              7.458146   0.007736   964.0   <2e-16 ***
scale(fat)               0.300768   0.008122    37.0   <2e-16 ***
scale(fiber)             0.104224   0.007820    13.3   <2e-16 ***
scale(fat):scale(fiber) -0.073786   0.008187    -9.0   <2e-16 ***

This is because the estimates are based on scaled parameters, so they must be interpreted with caution, or 'un-scaled.' See: Understanding `scale` in R for an understanding of what the scale() function does, and see: interpretation of scaled regression coefficients... for a more in-depth understanding of what this means in a model

As a last note, the fact that models converge doesn't mean that they are a good fit.

Dylan_Gomes
  • 2,066
  • 14
  • 29