I am working on an example from http://r-statistics.co/Logistic-Regression-With-R.html. I have problem with smbinning code. I am trying to get Information Value via using smbinning.
library(smbinning)
# segregate continuous and factor variables
factor_vars <- c ("WORKCLASS", "EDUCATION", "MARITALSTATUS", "OCCUPATION", "RELATIONSHIP", "RACE", "SEX", "NATIVECOUNTRY")
continuous_vars <- c("AGE", "FNLWGT","EDUCATIONNUM", "HOURSPERWEEK", "CAPITALGAIN", "CAPITALLOSS")
iv_df <- data.frame(VARS=c(factor_vars, continuous_vars), IV=numeric(14)) # init for IV results
# compute IV for categoricals
for(factor_var in factor_vars){
smb <- smbinning.factor(trainingData, y="ABOVE50K", x=factor_var) # WOE table
if(class(smb) != "character"){ # heck if some error occured
iv_df[iv_df$VARS == factor_var, "IV"] <- smb$iv
}
}
This is the code given. I cannot understand the reason behind checking class of the smbinning. My general understanding on smbinning is also not that good.
for(vars in factor_vars){
smb <- smbinning.factor(trainingData, y = "ABOVE50K", x = vars )
iv_df[iv_df$VARS == vars, "IV"] <- smb["iv"]
}
When I run this code I am getting some values NA values. So class checking is apparently needed but why?
Thank you very much.