I am building a few logistic regression models and find myself using the varImp() function from the caret package, called as varImp(model). The function has been useful, but I would like the variable importance returned sorted from most important to least important.

Here is a reproducible example:

library(caret)
data("GermanCredit")

Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]

mod_fit <- glm(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own + CreditHistory.Critical,
               data = training, family = binomial(link = 'logit'))

When I use the code:

varImp(mod_fit)

It returns:

                        Overall
Age                    1.747346
ForeignWorker          1.612483
Property.RealEstate    2.715444
Housing.Own            2.066314
CreditHistory.Critical 3.944768

I tried sorting by the "Overall" column like this:

sort(varImp(mod_fit)$Overall)

But that returns only the sorted values, without the variable names:

[1] 1.612483 1.747346 2.066314 2.715444 3.944768

Is there a way to return the variable names and their importance together, sorted in descending order?

Thank you in advance.

Aaron England
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Jun 12 '18 at 15:09

2 Answers

library(caret)
data("GermanCredit")

Train <- createDataPartition(GermanCredit$Class, p=0.6, list=FALSE)
training <- GermanCredit[ Train, ]
testing <- GermanCredit[ -Train, ]

mod_fit <- glm(Class ~ Age + ForeignWorker + Property.RealEstate + Housing.Own + CreditHistory.Critical,
               data = training, family = binomial(link = 'logit'))

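# For a glm, varImp() returns a data frame with the variable names as row names
imp <- as.data.frame(varImp(mod_fit))
# Copy the names into their own column so they stay next to the values
imp <- data.frame(overall = imp$Overall,
                  names   = rownames(imp))
# Sort from most to least important
imp[order(imp$overall, decreasing = TRUE), ]

This returns (the values differ from those in the question because createDataPartition() draws a new random split on each run unless a seed is set):

    overall                  names
 3.9234999 CreditHistory.Critical
 3.1402835            Housing.Own
 2.1955440                    Age
 1.3042088          ForeignWorker
 0.4878837    Property.RealEstate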
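
A shorter variant of the same idea, as a minimal sketch: reorder the varImp() data frame directly and keep the variable names as row names. To stay runnable without caret, the data frame below is rebuilt by hand from the values printed in the question; with the real model you would start from imp <- varImp(mod_fit) instead. The name imp_sorted is only illustrative.

```r
# Importance values copied from the question's varImp(mod_fit) output,
# rebuilt by hand here so the snippet runs without caret.
imp <- data.frame(
  Overall = c(1.747346, 1.612483, 2.715444, 2.066314, 3.944768),
  row.names = c("Age", "ForeignWorker", "Property.RealEstate",
                "Housing.Own", "CreditHistory.Critical")
)

# order() gives the row permutation; drop = FALSE keeps the one-column
# result a data frame, so the row names stay attached to the values.
imp_sorted <- imp[order(imp$Overall, decreasing = TRUE), , drop = FALSE]
print(imp_sorted)
```

Without drop = FALSE, subsetting a one-column data frame by rows collapses it to a bare numeric vector, which is the same loss of names seen with sort(varImp(mod_fit)$Overall).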
Hack-R

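For a model fit through caret's train(), you would usually be able to do:

varImp(mod_fit, scale = TRUE)

That scales the relative importance to a range from 0 to 100, and the printed result is ordered from most to least important. For a plain glm object such as the mod_fit in the question, varImp() returns an ordinary unscaled data frame, so you would still need to sort it yourself.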
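
As a rough sketch of the train() route, assuming caret is installed: the same formula is fit with method = "glm", and varImp() then returns an object whose importance element holds the scaled values. The seed value and the names fit, vi and vi_sorted are illustrative choices, not part of the original posts.

```r
library(caret)
data("GermanCredit")

set.seed(123)  # arbitrary seed, only so the random split is repeatable
Train <- createDataPartition(GermanCredit$Class, p = 0.6, list = FALSE)
training <- GermanCredit[Train, ]

# Fit the same logistic regression through train() rather than glm();
# trainControl(method = "none") fits once, without resampling.
fit <- train(Class ~ Age + ForeignWorker + Property.RealEstate +
               Housing.Own + CreditHistory.Critical,
             data = training, method = "glm", family = "binomial",
             trControl = trainControl(method = "none"))

# scale = TRUE rescales the importance to the 0-100 range
vi <- varImp(fit, scale = TRUE)

# The scaled values live in vi$importance; sort them explicitly
vi_sorted <- vi$importance[order(vi$importance$Overall, decreasing = TRUE),
                           , drop = FALSE]
print(vi_sorted)
```

Printing vi itself also lists the variables from most to least important, but sorting vi$importance gives a data frame you can reuse.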