Model Fit statistics for a Logistic Regression

Question

I'm running a logistic regression model in R. I've used both the Zelig and car packages. However, I'm wondering if there is a simple way to get the fit statistics for the model (pseudo R-squared, chi-square, log-likelihood, etc.).

You can find a few examples here: http://www.ats.ucla.edu/stat/r/dae/mlogit.htm — nico, Nov 21 '10 at 11:51
Looks like you got an answer you liked (below), would you be willing to select it as your preferred answer? — David J., Jan 03 '11 at 01:33
You might find that this Q&A Site is better for stats questions: http://stats.stackexchange.com/ — David J., Jan 03 '11 at 01:36

score 4 · Answer 1 · edited Jun 25 '19 at 02:34


Assume glm1 is your fitted model and n is your sample size (for example, n = 100).

Here are some goodness-of-fit-measures:

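n <- nobs(glm1)  # number of observations actually used in the fit

# For a binary (0/1) response, deviance = -2 * log-likelihood
R2 <- 1 - glm1$deviance / glm1$null.deviance
cat("McFadden R2 = ", R2, "\n")

# Cox-Snell: 1 - (L0 / L1)^(2 / n), written in terms of deviances
R2 <- 1 - exp((glm1$deviance - glm1$null.deviance) / n)
cat("Cox-Snell R2 = ", R2, "\n")

# Nagelkerke rescales Cox-Snell by its maximum attainable value
R2 <- R2 / (1 - exp(-glm1$null.deviance / n))
cat("Nagelkerke R2 = ", R2, "\n")

# AIC = deviance + 2 * (number of estimated coefficients)
aic <- glm1$deviance + 2 * glm1$rank
cat("AIC = ", aic, "\n")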

This gives you an overview of how these goodness-of-fit measures are calculated.
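As a cross-check, the same quantities can be computed from the log-likelihoods directly, together with the likelihood-ratio chi-square the question asks about. This is only a sketch: the mtcars model (am ~ wt) is a stand-in chosen for illustration, not the asker's data, and the deviance-based shortcuts are assumed to apply because the response is binary (0/1).

```r
# Illustrative model on a built-in data set (an assumption, not the asker's data)
glm1 <- glm(am ~ wt, data = mtcars, family = binomial)
n <- nobs(glm1)

# Log-likelihoods of the fitted model and of the intercept-only (null) model
ll_full <- as.numeric(logLik(glm1))
ll_null <- as.numeric(logLik(update(glm1, . ~ 1)))

# Pseudo R-squared measures
mcfadden   <- 1 - ll_full / ll_null
cox_snell  <- 1 - exp(2 * (ll_null - ll_full) / n)
nagelkerke <- cox_snell / (1 - exp(2 * ll_null / n))

# Likelihood-ratio chi-square test against the null model
lr_chisq <- glm1$null.deviance - glm1$deviance
lr_df    <- glm1$df.null - glm1$df.residual
lr_p     <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

# Built-in AIC for comparison with the hand calculation
aic <- AIC(glm1)

print(c(McFadden = mcfadden, CoxSnell = cox_snell, Nagelkerke = nagelkerke))
print(c(chisq = lr_chisq, df = lr_df, p = lr_p, AIC = aic))
```

Using logLik() avoids relying on the deviance being equal to -2 times the log-likelihood, which does not hold for grouped binomial data.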

answered Apr 11 '13 at 13:21 by Redfood · edited Jun 25 '19 at 02:34 by OTStats

Just a short addition on this topic: these goodness-of-fit measures are based on the log-likelihood, which is why they are not interpreted like "normal" R-squared values. A McFadden value of 0.2 doesn't mean that 20% of the variance is explained by the model, so it is not the same as the R-squared from OLS. For most models, though, a pseudo R-squared >= 0.2 is quite good. – Redfood Apr 11 '13 at 13:44

score 1 · Answer 2 · answered Jul 26 '10 at 16:56


Typically this is done using the summary() function.

answered Jul 26 '10 at 16:56 by Shane

summary() provides me with the coefficients and regression parameters. That's important, but not what I'm looking for. Furthermore, with the Zelig output, I get the following:

Null deviance: 1068.24 on 772 degrees of freedom
Residual deviance: 939.48 on 761 degrees of freedom
(941 observations deleted due to missingness)
AIC: 963.48

– Tony Jul 26 '10 at 17:04

thanks!!! I also found that running the logistic regression using the lrm function from the Design package gives the pseudo-R^2 as an output. – Tony Jul 26 '10 at 19:02

score 1 · Answer 3 · answered Jul 26 '10 at 17:49

It's hard to answer this question without knowing what the model object is. I'm not sure what Zelig produces.

I would look at names(model), names(summary(model)) or names(anova(model,test = "Chisq")) to see if the stats you want are there. I know that for log-likelihood, logLik(model) will give you what you want.
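The inspection steps above might look like this for a plain glm fit. The mtcars model is an assumption used for illustration; a Zelig object may expose different components.

```r
# Illustrative glm fit (an assumption; a Zelig model object may differ)
model <- glm(am ~ wt, data = mtcars, family = binomial)

# Components stored on the fit and on its summary
fit_names     <- names(model)
summary_names <- names(summary(model))
print(fit_names)
print(summary_names)

# Log-likelihood, with the number of estimated parameters as an attribute
ll <- logLik(model)
print(ll)

# Sequential analysis-of-deviance table with chi-square tests
dev_table <- anova(model, test = "Chisq")
print(dev_table)
```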

score 1 · Answer 4 · answered Jul 29 '10 at 19:27

While I'm no expert, model fit statistics for logistic regression models are not as straightforward to interpret as those in linear regression. Assuming you have a binary response, one method I've found useful is to group your data by predicted probability interval (0-10%, 10%-20%, ..., 90%-100%) and compare the observed proportions to the predicted ones. This is very helpful because a model will often over-predict at the low end or under-predict at the high end, and seeing where it does so may lead you to a better model.
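A minimal sketch of that grouping, assuming a binary glm fit. The mtcars model and the column names of the calib table are made up for illustration, and fixed 10% bins are used as described above (quantile-based groups are a common alternative).

```r
# Illustrative model (an assumption, not the asker's data)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

p_hat <- fitted(fit)  # predicted probabilities
y     <- fit$y        # observed 0/1 response stored on the fit

# Assign each observation to a 10%-wide predicted-probability bin
bins <- cut(p_hat, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)

# Per-bin count, mean predicted probability, and observed event rate
# (empty bins show n = 0 and NA for both rates)
calib <- data.frame(
  bin            = levels(bins),
  n              = as.vector(table(bins)),
  mean_predicted = as.vector(tapply(p_hat, bins, mean)),
  observed_rate  = as.vector(tapply(y, bins, mean))
)
print(calib)
```

Bins where observed_rate differs markedly from mean_predicted point to regions where the model is poorly calibrated. With a small sample such as this one, many bins hold only a few observations, so the comparison is noisy.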

Is this not just the Hosmer-Lemeshow test for GOF in logit? — Nneka, Aug 25 '18 at 18:50

score 1 · Answer 5 · answered Jun 24 '19 at 19:21

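Have a look at the pscl package. Be careful with missing data, however: in the example below, the two fits report the same llh but different llhNull values, which suggests the null model was evaluated on a different number of rows, so the pseudo R-squared values for the model fit on data containing NAs are inflated.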

library(MASS)
library(pscl)  # provides pR2() and the admit data set

admit_2 <- admit
admit_2$gre.quant[sample(1:106, 45)] <- NA

m0 <- MASS::polr(score ~ gre.quant + gre.verbal + ap + pt + female,
              Hess=TRUE,
              data=admit_2,
              method="probit")

m1 <- MASS::polr(score ~ gre.quant + gre.verbal + ap + pt + female,
             Hess=TRUE,
             data= na.omit(admit_2),
             method="probit")

pR2(m0)
         llh      llhNull           G2     McFadden         r2ML         r2CU
 -57.4666891 -151.0299826  187.1265870    0.6195015    0.9534696    0.9602592

pR2(m1)
        llh     llhNull          G2    McFadden        r2ML        r2CU
-57.4666891 -83.3891852  51.8449922   0.3108616   0.5725500   0.6123230
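The same pitfall can be reproduced in base R without pscl. The sketch below is a glm analogue of the situation, not a description of how pR2() works internally; the mtcars model and the randomly inserted NAs are assumptions made for illustration.

```r
set.seed(1)
d <- mtcars
d$wt[sample(nrow(d), 10)] <- NA  # introduce missing predictor values

# The full model silently drops the 10 incomplete rows
full <- glm(am ~ wt, data = d, family = binomial)

# A null model fit on the raw data uses all 32 rows (am has no NAs)
null_all <- glm(am ~ 1, data = d, family = binomial)

# A null model fit on exactly the rows the full model used
null_same <- glm(am ~ 1, data = model.frame(full), family = binomial)

ll <- function(m) as.numeric(logLik(m))

mcfadden_mismatched <- 1 - ll(full) / ll(null_all)   # row counts differ
mcfadden_matched    <- 1 - ll(full) / ll(null_same)  # same rows in both

print(c(n_full = nobs(full), n_null_all = nobs(null_all),
        n_null_same = nobs(null_same)))
print(c(mismatched = mcfadden_mismatched, matched = mcfadden_matched))
```

Fitting both models on the same complete cases, as the na.omit() call in m1 does, keeps the two log-likelihoods comparable.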

Also, have a look here: https://stats.stackexchange.com/questions/8511/how-to-calculate-pseudo-r2-from-rs-logistic-regression
