3

I am using the ROCR package and was wondering how to plot a ROC curve for a kNN model in R. Is there any way to do it with this package?

I don't know how to use ROCR's prediction function with knn. Here's my example; I am using the isolet dataset from the UCI repository, where I renamed the class attribute to y:

cl<-factor(isolet_training$y)
knn_isolet<-knn(isolet_training, isolet_testing, cl, k=2, prob=TRUE)

Now my question is: what arguments should I pass to ROCR's prediction function? I tried the two alternatives below, neither of which works:

library(ROCR)
pred_knn<-prediction(knn_isolet$y, cl)
pred_knn<-prediction(knn_isolet$y, isolet_testing$y)
Backlin
spektra
  • I guess it can be done since the ROCR package is all about visualizing various aspects of classifiers. It would be great if you could provide a toy example where you show how you fit your kNN classifier. – Backlin Jul 31 '12 at 15:05
  • @Backlin I just added an example. – spektra Aug 01 '12 at 08:59

2 Answers

6

There are several steps to work through to get a ROC curve here. I am going to make up some data, since you did not provide an easy way of getting the data you are using. Note that ROCR needs to know which class label is the negative one and which the positive one; numeric labels -1 and 1 make that ordering explicit, so let's use those.

# Generate fake data
isolet_training <- sweep(matrix(rnorm(400), 40, 10), 1, rep(0:1, each=20))
isolet_testing <- sweep(matrix(rnorm(400), 40, 10), 1, rep(0:1, each=20))
# Generate class labels
cl <- cl_testing <- rep(c(-1, 1), each=20)

You can now train your knn and obtain its class probabilities from the "prob" attribute.

knn_isolet <- class::knn(isolet_training, isolet_testing, cl, k=2, prob=TRUE)
prob <- attr(knn_isolet, "prob")
# you can probably use just `knn` instead of `class::knn`,
# but for some reason it did not work for me.

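However, "prob" holds the proportion of votes for the winning class, not for the positive class, so we need to invert it for the cases predicted as -1. Here I also rescale the result to [-1, 1]; that is a monotone transformation, so it does not change the ROC curve.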

prob <- 2*ifelse(knn_isolet == "-1", 1-prob, prob) - 1
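To make the inversion step concrete, here is a minimal base-R sketch with hand-made values. The names `predicted`, `win_share`, `p_pos` and `score` are hypothetical and only stand in for the kNN output and its "prob" attribute; the numbers are invented for illustration.

```r
# Hypothetical kNN output for five test cases (as if k = 4):
# the predicted class and the share of neighbours that voted for it.
predicted <- factor(c("-1", "-1", "1", "1", "1"), levels = c("-1", "1"))
win_share <- c(1.00, 0.75, 0.50, 0.75, 1.00)

# Share of votes for the positive class "1":
# keep the value when "1" won, take the complement when "-1" won.
p_pos <- ifelse(predicted == "1", win_share, 1 - win_share)
print(p_pos)   # 0.00 0.25 0.50 0.75 1.00

# Rescaling to [-1, 1] keeps the ordering of the cases unchanged.
score <- 2 * p_pos - 1
print(score)   # -1.0 -0.5  0.0  0.5  1.0
```

Since a ROC curve depends only on how the cases are ranked, either `p_pos` or `score` should produce the same curve.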

Now you can feed the "probabilities" into the ROCR package's functions and obtain a ROC curve.

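library(ROCR)
# prediction() pairs the scores with the true labels
pred_knn <- prediction(prob, cl_testing)
# performance() computes the true and false positive rates per threshold
perf_knn <- performance(pred_knn, "tpr", "fpr")
plot(perf_knn, avg="threshold", colorize=TRUE, lwd=3, main="Voilà, a ROC curve!")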
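If you also want a single number to go with the curve, the area under it can be read from the same kind of prediction object. The sketch below is self-contained and uses its own simulated data; the names `train`, `test`, `labels`, `fit` and `p_pos`, the seed and the choice of k = 5 are assumptions made for this illustration, not part of the question.

```r
library(class)  # knn()
library(ROCR)   # prediction(), performance()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Simulated data: rows 21-40 are shifted by 1 in every column
shift  <- rep(0:1, each = 20)
train  <- matrix(rnorm(400), 40, 10) + shift
test   <- matrix(rnorm(400), 40, 10) + shift
labels <- rep(c(-1, 1), each = 20)

# An odd k avoids tied votes between the two classes
fit <- knn(train, test, cl = labels, k = 5, prob = TRUE)

# Turn the winning-class vote share into a positive-class score
win_share <- attr(fit, "prob")
p_pos <- ifelse(fit == "1", win_share, 1 - win_share)

# Area under the ROC curve
pred <- prediction(p_pos, labels)
auc  <- performance(pred, "auc")@y.values[[1]]
print(auc)
```

With only k + 1 distinct score values, the curve has few steps, so a larger k gives a finer-grained ROC curve at the cost of a smoother classifier.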


Backlin
  • If the vector provided in the `predictions` argument must contain the positive-class probability of each case, why is `ifelse(knn_isolet == "-1", 1-prob, prob)` alone not enough to obtain those probabilities? – ForEverNewbie Oct 21 '20 at 16:16
0

pred_knn<-prediction(knn_isolet$y, isolet_testing$y)

This line would work just fine, but according to the documentation, both the arguments must be vectors.

So first do:

knn_isolet$y <- as.vector(knn_isolet$y, mode = "numeric")

isolet_testing$y <- as.vector(isolet_testing$y, mode = "numeric")

Note: ROCR only supports binary classification, so check that 'knn_isolet$y' and 'isolet_testing$y' contain the same two labels.