I am not able to get the roc function to work; I get the error "Predictor must be numeric or ordered".

I've looked through other posts, but nothing solves my problem. Any help is highly appreciated.

"Get data"
flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest


library(MASS)
fly_qda=qda(diabetes~., data=ctrain)


#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest, probability=TRUE)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda

"ROC curve and AUC"
predict_qdatrain<-predict(fly_qda, newdata=ctrain)
roc_qda=roc(response=ctrain$diabetes, predictor= predict_qdatrain$class, plot=TRUE)
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc_qda<-auc(roc_qda)

I want to get the plotted ROC curve and the AUC.

Calimo
MJ O
  • Hi there, just taking a look at this. I get the error - Error in roc(response = ctrain$diabetes, predictor = predict_qdatrain$class, : could not find function "roc". Which package is this function from? – Ollie Perkins Apr 19 '19 at 11:51

3 Answers

As Ollie Perkins explained in his answer, the error you are getting indicates that you are passing something that cannot be sorted and therefore cannot be used for ROC analysis. In the case of predict.qda, the class item is an unordered factor with levels 0 and 1 indicating the predicted class.
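To see what predict.qda actually returns, here is a minimal sketch on simulated data. The data frame `sim` and its columns `x1` and `x2` are made up for illustration and only stand in for the data in the question.

```r
library(MASS)

set.seed(1)
n <- 200
# Simulated stand-in for the question's data (assumption, not the real data set)
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$diabetes <- factor(rbinom(n, 1, plogis(1.5 * sim$x1 - sim$x2)))

fit  <- qda(diabetes ~ ., data = sim)
pred <- predict(fit, newdata = sim)

is.factor(pred$class)              # the predicted class is a factor ...
is.ordered(pred$class)             # ... but not an ordered one
colnames(pred$posterior)           # one posterior column per class level
is.numeric(pred$posterior[, "1"])  # the posterior probabilities are numeric
```

The `class` element is the one that triggers the error, while each column of `posterior` is a plain numeric vector.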

Rather than converting the class to an ordered predictor, it is a better idea to use the posterior probabilities. Let's use the probability of belonging to class 1:

roc_qda <- roc(response = ctrain$diabetes, predictor = predict_qdatrain$posterior[,"1"])
plot(roc_qda, col="red", lwd=3, main="ROC curve QDA")
auc(roc_qda)

This will give you a smoother curve and more classification thresholds to choose from.
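The difference in the number of thresholds can be checked with a small sketch. It runs on simulated data (the data frame `sim` is an assumption, not the data from the question) and assumes a pROC version that supports the `quiet` argument.

```r
library(MASS)
library(pROC)

set.seed(1)
n <- 200
# Simulated stand-in for the question's data (assumption)
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$diabetes <- rbinom(n, 1, plogis(1.5 * sim$x1 - sim$x2))

fit  <- qda(diabetes ~ ., data = sim)
pred <- predict(fit, newdata = sim)

# Hard 0/1 labels: only one cut-off separates the two predicted classes
roc_class <- roc(response = sim$diabetes,
                 predictor = as.numeric(as.character(pred$class)),
                 levels = c(0, 1), direction = "<", quiet = TRUE)

# Posterior probability of class "1": one cut-off per distinct probability
roc_post <- roc(response = sim$diabetes,
                predictor = pred$posterior[, "1"],
                levels = c(0, 1), direction = "<", quiet = TRUE)

length(roc_class$thresholds)  # a handful of thresholds
length(roc_post$thresholds)   # many thresholds
auc(roc_class)
auc(roc_post)
```

Comparing the two `thresholds` vectors shows why the posterior-based curve has many more points than the class-based one.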

[Figure: ROC curve QDA]

Calimo

Assuming you are using the pROC package, I have fixed this below. The error message means that the predictor variable has to be either numeric (a floating point number) or an ordered factor (a categorical variable where the order of the levels matters). Therefore, to calculate the ROC curve from your predict object, I have converted the predicted class to an ordered factor on the fly below.
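Why the predictor has to be sortable can be illustrated with a rough base R sketch. The AUC equals the share of (case, control) pairs in which the case gets the higher score, which only makes sense if scores can be compared. The helper `auc_by_pairs` is hypothetical and written for illustration; it is not how pROC computes the AUC internally.

```r
# Hypothetical helper: AUC as the share of (case, control) pairs that are
# ranked correctly, with ties counted as one half
auc_by_pairs <- function(response, predictor) {
  cases    <- predictor[response == 1]
  controls <- predictor[response == 0]
  diffs <- outer(cases, controls, "-")  # every case score minus every control score
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# Toy example: 3 of the 4 (case, control) pairs are ranked correctly
auc_by_pairs(response = c(0, 0, 1, 1), predictor = c(0.1, 0.4, 0.35, 0.8))
```

With an unordered factor as predictor, the comparison inside `outer` is not defined, which is the situation the error message guards against.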

Secondly, in your original code you were predicting on the original training set. I have changed this to the test data below.

"Get data"

flying=dget("https://www.math.ntnu.no/emner/TMA4268/2019v/data/flying.dd")
ctrain=flying$ctrain
ctest=flying$ctest


library(MASS)
library(pROC)
fly_qda=qda(diabetes~., data=ctrain)


#Test error is given below:
predict_qda=predict(fly_qda, newdata=ctest, probability=TRUE)
table_qda<-table(ctest$diabetes, predict_qda$class)
error_qda<-1-sum(diag(table_qda))/sum(table_qda)
error_qda

# ROC curve and AUC, computed on the test data
predict_qdatest <- predict(fly_qda, newdata = ctest)
roc_qda <- roc(response = ctest$diabetes,
               predictor = factor(predict_qdatest$class, ordered = TRUE))
plot(roc_qda, col = "red", lwd = 3, main = "ROC curve QDA")
auc_qda <- auc(roc_qda)
Ollie Perkins
  • What is "a floating point integer"? Also you really shouldn't convert the binary class to an ordered factor; although it might make some sense here with 0 and 1 (you could've converted it to integers too) you'll get a single ROC point which isn't very representative of the underlying model. – Calimo Apr 20 '19 at 07:32

I used as.numeric and it works for me.

#ROC-AUC

set.seed(234)

"ROC curve and AUC"

rocX1 =roc(response=testing_FDI$ï..FDIInflow, predictor= as.numeric(FDI_test_pred2))
rocX1
# levels: control = 0, case = 1
plot.roc(rocX1, col="red", lwd=3, main="ROC curve fdi")
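One caveat, assuming the predictions passed to as.numeric are stored as a factor: as.numeric returns the internal level codes, not the original labels. A short sketch with a made-up factor `f`:

```r
# Made-up factor of predicted classes, for illustration only
f <- factor(c("0", "1", "1", "0"))

as.numeric(f)                # internal level codes: 1 2 2 1
as.numeric(as.character(f))  # original labels as numbers: 0 1 1 0
```

Either version is numeric and therefore accepted by roc, but with hard class labels the curve still has only a single informative point.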