5

I use the multinom() function from the nnet package to run the multinomial logistic regression in R. The nnet package does not include p-value calculation and t-statistic calculation. I found a way to calculate the p-values using the two tailed z-test from this page. To give one example of calculating a test statistic for a multinom logit (not really a t-stat, but an equivalent) I calculate the Wald's statistic:

mm<-multinom(Empst ~ Agegroup + Marst + Education + State, 
             data = temp,weight=Weight)
W <- (summary(mm1)$coefficients)^2/(summary(mm1)$standard.errors)^2

I take the square of a coefficient and divide by the square of the coefficient's standard error. However, the likelihood-ratio test is the preferable measure of a goodness of fit for the logistic regressions. I do not know how to write code that will calculate the likelihood ratio statistic for each coefficient due to the incomplete understanding of the likelihood function. What would be the way to calculate the likelihood-ratio statistic for each coefficient using the output from the multinom() function? Thanks for your help.

Koba
  • 1,514
  • 4
  • 27
  • 48

3 Answers3

3

Let's look at predicting Sepal.Length from the iris dataset using Species (a categorical variable) and Petal.Length (a continuous variable). Let's start by converting our factor variable into multiple binary variables using model.matrix and building our neural network:

library(nnet)
data(iris)
mat <- as.data.frame(model.matrix(~Species+Petal.Length+Sepal.Length, data=iris))
mm <- multinom(Sepal.Length~.+0, data=mat, trace=F)

Now we can run a likelihood ratio test for a variable in our model:

library(lmtest)
lrtest(mm, "Speciesversicolor")
# Likelihood ratio test
# 
# Model 1: Sepal.Length ~ `(Intercept)` + Speciesversicolor + Speciesvirginica + 
#     Petal.Length + 0
# Model 2: Sepal.Length ~ `(Intercept)` + Speciesvirginica + Petal.Length - 
#     1
#   #Df  LogLik  Df  Chisq Pr(>Chisq)
# 1 136 -342.02                      
# 2 102 -346.75 -34 9.4592          1

To run the likelihood ratio test for all your variables, I guess you could just use a loop and run for each variable name. I've extracted just the p-values in this loop.

for (var in mm$coefnames[-1]) {
  print(paste(var, "--", lrtest(mm, var)[[5]][2]))
}
# [1] "Speciesversicolor -- 0.999990077592342"
# [1] "Speciesvirginica -- 0.998742545590864"
# [1] "Petal.Length -- 3.36995663002528e-14"
josliber
  • 43,891
  • 12
  • 98
  • 133
  • thats helpful. However, I need to calculate the likelihood ratio stat for each coefficient which I do not know how to. – Koba Apr 11 '14 at 17:05
  • @Koba I updated to add a likelihood ratio test for a variable. – josliber Apr 11 '14 at 17:39
  • I will accept your answer. But, I probably had to mention, all my variables are categorical. For example, I have a variable Agegroup with 5 age groups(categories). This leads to having 5 dummy variables in the model, ex.g. `multinom(EmploymentStatus ~ Agegroup,...)`. The formula in the call is equivalent to `EmploymentStatus = Agegroup1 + Agegroup2 + Agegroup3 + Agegroup 4 + Agegroup5`. The multinom call gives me output with coefficients for an intercept and each age group. Using your suggestion I get the significance for Agegroup variable in the multinom call, but not the... – Koba Apr 11 '14 at 18:25
  • ...significance for coefficients of each age group. I can update the question and add the output of my multinom call if you are open to pursuing this question further. – Koba Apr 11 '14 at 18:27
  • 1
    @Koba I've updated the example to use a factor variable – josliber Apr 11 '14 at 18:38
  • you are the best man. I will include you answer in my blog post on multinom logit in R. You could leave the example with continuous variable, so your answer would be complete in every way. Thanks. – Koba Apr 11 '14 at 18:42
3

Use the Anova function in the car package for the likelihood-ratio test of each term in your model.

library(nnet)
data(iris)


mm <- multinom(Species ~ ., data=iris, trace=F)

### car package
library(car)
Anova(mm)
William Chiu
  • 388
  • 3
  • 19
2

From the response of @jolisber i extracted a function so anyone can do this and store the values in a df. Well, i stored the full character vector in the df.

likehoodmultinom2 <- function(model_lmm) 
{

  i <- 1
  values<- c("No funciona") 

  for (var in model_lmm$coefnames[-1]) { # Qutiamos el -1 de coefnames para no obener un NA

  values[i] =(paste(var, "--", lrtest(model_lmm, var)[[5]][2]))
  i=i+1

  }
  return (values)
}

However i cant get the first element (variable) p-value. I dont know why. And i cant ignore the [-1] in model_lmm$coefnames. EDITED. I edited i=0 to i=1; forgot that R vectors start at that :D.

Hope this works for everyone :D

EDIT 2

Also did 1 so it can store in a df.

likehoodmultinom_p <- function(model_lmm) 
{

  i <- 1

  variables <-c("No funciona")
  values <- c("No funciona") 


  for (var in model_lmm$coefnames[-1]) { 

  variables[i] =paste(var)
  values[i]= lrtest(model_lmm, var)[[5]][2]
  i=i+1
   ## Contributed to stack at: 
  }
  return (data.frame(variables,values))
}
Galpaccru
  • 57
  • 12