I'm getting this result when testing the predictive accuracy of a logistic regression model. It doesn't seem right. Any help appreciated!

> dput(head(test$subscribed))
structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("no", "yes"), class = 
"factor")

Input

predictions <- predict(final_model, test, type = "response")
class_pred<- as.factor(ifelse(predictions > .5, "Yes", "No"))
postResample(class_pred, test$subscribed)

Output

 Accuracy    Kappa 
  NA       NA 
JoeD93
  • Can you check whether there are any NAs in class_pred? ```table(is.na(class_pred))```. You need to remove the NAs – StupidWolf Dec 14 '20 at 02:10
  • Hi, this is the output of that. FALSE 6907 – JoeD93 Dec 14 '20 at 02:15
  • I cannot reproduce your error. If you just need accuracy, do ```confusionMatrix(table(class_pred, test$subscribed))``` – StupidWolf Dec 14 '20 at 02:17
  • What actually is ```final_model```? Is it fitted using glm? As you can see, your question is lacking a lot of information. See https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – StupidWolf Dec 14 '20 at 02:19
  • This one is confusing me. Yes, it's just for accuracy. I get this error for the confusion matrix: Error in confusionMatrix.table(table(class_pred, test$subscribed)) : the table must the same classes in the same order – JoeD93 Dec 14 '20 at 02:20
  • Yes, final_model is a logistic regression model, with independent variables used to predict a binomial yes/no response – JoeD93 Dec 14 '20 at 02:23
  • Yeah sure, that's done – JoeD93 Dec 14 '20 at 02:35
  • You did not set the levels correctly. See my answer. – StupidWolf Dec 14 '20 at 02:39

1 Answer

Let's say your data is like this:

library(caret)  # provides confusionMatrix() and postResample()
df = data.frame(subscribed=sample(c("yes","no"),100,replace=TRUE),
x1 = runif(100),x2=runif(100))

Set the factor levels explicitly:

df$subscribed = factor(df$subscribed,levels=c("no","yes"))
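
As a small illustration of why the level order matters, here is a sketch in base R on a made-up label vector (`labels` and `y` are hypothetical names, not the asker's data). For a binomial `glm` with a factor response, R treats the first level as "failure" and the others as "success", so with levels `c("no","yes")` the fitted probabilities are probabilities of "yes".

```r
# Hypothetical toy labels, not the asker's data
labels <- c("yes", "no", "yes", "no", "no")

# Explicit levels: "no" is the first (reference) level, "yes" the second
y <- factor(labels, levels = c("no", "yes"))

levels(y)        # "no" "yes"
as.integer(y)    # 2 1 2 1 1  (internal codes follow the level order)

# The first level is the one a binomial glm treats as "failure"
levels(y)[1]     # "no"
```

This matches the `dput` output in the question, where the code `1L` corresponds to the label "no".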

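Fit the model on the first 70 rows and hold out the remaining 30 for testing:

traindf = df[1:70,]
test = df[71:100,]  # held-out rows, not overlapping with traindf
final_model = glm(subscribed ~ .,data=traindf,family="binomial")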

Then predict, and build the predicted factor with the same levels as the response. Note that levels are case-sensitive, so "yes" is not the same as "Yes":

predictions <- predict(final_model, test, type = "response")
class_pred<- ifelse(predictions > .5, "yes", "no")
class_pred = factor(class_pred,levels=c("no","yes"))
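
A quick way to see the problem in isolation is the following base R sketch. The vectors `obs`, `bad_pred` and `good_pred` are hypothetical stand-ins for `test$subscribed` and the thresholded predictions. It assumes, without confirming caret's internals, that a metric function which aligns predictions to the observed levels would hit the same issue.

```r
# Hypothetical stand-ins for test$subscribed and the thresholded predictions
obs <- factor(c("no", "yes", "yes", "no"), levels = c("no", "yes"))
bad_pred <- factor(c("No", "Yes", "Yes", "Yes"))   # capitalised labels

# The two level sets share nothing, so there is nothing to match on
intersect(levels(bad_pred), levels(obs))            # character(0)

# Re-coding the predictions onto the observed levels turns every value into NA
recoded <- factor(as.character(bad_pred), levels = levels(obs))
all(is.na(recoded))                                 # TRUE

# With matching labels the comparison works
good_pred <- factor(c("no", "yes", "yes", "yes"), levels = levels(obs))
mean(good_pred == obs)                              # 0.75
```

Checking `levels(class_pred)` against `levels(test$subscribed)` before calling any metric function is a cheap guard against this kind of mismatch.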

Then:

confusionMatrix(table(class_pred, test$subscribed))
Confusion Matrix and Statistics

          
class_pred no yes
       no   0   1
       yes 14  15
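
If only the accuracy is needed, it can also be read off the table by hand. The sketch below copies the counts from the confusion matrix shown above into a matrix (`cm` is a hypothetical name) and divides the diagonal by the total. The data were simulated at random, so an accuracy near chance is expected.

```r
# Counts copied from the confusion matrix shown above
# (rows = class_pred, columns = test$subscribed); matrix() fills column-wise
cm <- matrix(c(0, 14, 1, 15), nrow = 2,
             dimnames = list(class_pred = c("no", "yes"),
                             observed   = c("no", "yes")))

# Accuracy = correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(cm)) / sum(cm)
accuracy   # 0.5
```
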
StupidWolf