
I ran a logistic mixed-model regression, adjusting for a genetic relationship matrix, using the R package GMMAT (function: glmmkin()).

My output from the model includes the following components (taken from the user manual; a minimal sketch of accessing them follows the list):

  • theta: the dispersion parameter estimate [1] and the variance component parameter estimate [2]
  • coefficients: fixed effects parameter estimates (including the intercept).
  • linear.predictors: the linear predictors.
  • fitted.values: fitted mean values on the original scale.
  • Y: the final working vector, of length equal to the sample size.
  • P: the projection matrix, with dimensions equal to the sample size.
  • residuals: residuals on the original scale. NOT rescaled by the dispersion parameter.
  • cov: covariance matrix for the fixed effects (including the intercept).
  • converged: a logical indicator for convergence.

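For reference, here is a minimal sketch of how I fit the model and access these components; the glmmkin() call mirrors the examples in the GMMAT manual, and pheno (phenotype data frame), GRM (genetic relationship matrix), and the formula are placeholders for my actual inputs:

library(GMMAT)

fit <- glmmkin(disease ~ age + sex, data = pheno, kins = GRM,
               id = "id", family = binomial(link = "logit"))  # placeholder formula/data

fit$theta                # [1] dispersion estimate, [2] variance component estimate
fit$coefficients         # fixed-effect estimates (including the intercept)
head(fit$fitted.values)  # fitted mean values on the original (probability) scale
fit$converged            # TRUE if the fit converged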
I am trying to obtain the log-likelihood in order to compute the variance explained. My first instinct was to pull apart the logLik.glm function to compute this "by hand", but I got stuck trying to compute the AIC. I used the answer from here.
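For context, the part of stats:::logLik.glm I was trying to reproduce boils down to the following (paraphrased from the R sources, so the exact body may differ between R versions): for a binomial model the log-likelihood is recovered from the stored aic and rank.

# paraphrased core of stats:::logLik.glm (families without an estimated dispersion, e.g. binomial)
logLik_glm_core <- function(object) {
    p <- object$rank     # number of estimated coefficients
    p - object$aic / 2   # since aic = -2*logLik + 2*p
}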

As a sanity check, I ran a plain logistic regression with stats::glm(): model1$aic is 4013.232, but using the Stack Overflow answer I found, I obtained 30613.03.

My question is: does anyone know how to compute the log-likelihood of a logistic regression by hand in R, using the output listed above?


1 Answer


No statistical insight here, just the solution I see from looking at glm.fit. This only works if you did not specify prior weights when fitting the model (or, if you did, you would need to include those weights in the model object).

get_logLik <- function(s_model, family = binomial(logit)) {
    n <- length(s_model$y)
    wt <- rep(1, n)  # or s_model$prior.weights, if that field exists
    # total deviance: sum of squared deviance residuals
    deviance <- sum(family$dev.resids(s_model$y, s_model$fitted.values, wt))
    # model rank = number of estimated (non-NA) coefficients
    mod_rank <- sum(!is.na(s_model$coefficients))  # or s_model$rank, if that field exists

    # family$aic() returns -2*logLik for the binomial family; adding 2*rank gives the AIC
    aic <- family$aic(s_model$y, rep(1, n), s_model$fitted.values, wt, deviance) + 2 * mod_rank
    # invert aic = -2*logLik + 2*rank
    log_lik <- mod_rank - aic / 2
    return(log_lik)
}

For example...

model <- glm(vs ~ mpg, data = mtcars, family = binomial(logit))
logLik(model)
# 'log Lik.' -12.76667 (df=2)

sparse_model <- model[c("theta", "coefficients", "linear.predictors", "fitted.values", "y", "P", "residuals", "cov", "converged")]
get_logLik(sparse_model)
#[1] -12.76667
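
As a quick cross-check, the same identity aic = -2*logLik + 2*rank that get_logLik() inverts should hold exactly here, since no prior weights were used:

all.equal(model$aic, -2 * get_logLik(sparse_model) + 2 * model$rank)
# should return TRUE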