
The ROCR library in R offers the ability to plot an average ROC curve (taken directly from the ROCR reference manual):

library(ROCR)
# plot ROC curves for several cross-validation runs (dotted
# in grey), overlaid by the vertical average curve and boxplots
# showing the vertical spread around the average.
data(ROCR.xval)
pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)
perf <- performance(pred,"tpr","fpr")
plot(perf,col="grey82",lty=3)
plot(perf,lwd=3,avg="vertical",spread.estimate="boxplot",add=TRUE)

Averaged ROC plot with boxplot

Lovely. Unfortunately, there's seemingly no ability to obtain the average ROC curve itself as an object/dataframe/etc. for further statistical testing (say, with pROC). I did do some research (albeit perhaps after the fact), and I found this post:

Global variables in R

Looking through ROCR's code reveals the following lines for passing a result to a plot:

performance_plots.R, (starting at line 451)

## compute average curve
 perf.avg <- perf.sampled
 perf.avg@x.values <- list( rowMeans( data.frame( perf.avg@x.values)))
 perf.avg@y.values <- list(rowMeans( data.frame( perf.avg@y.values)))
 perf.avg@alpha.values <- list( alpha.values )

So, using the trace function I looked up here (General suggestions for debugging in R):

trace(.performance.plot.horizontal.avg, edit=TRUE)

I added the following line to the performance_plots.R after the lines listed above:

perf.rocr.avg <<- perf.avg # note the `<<-` operator, which assigns outside the function's own scope

A horrible hack, yet it works as I can plot perf.rocr.avg without a problem. Unfortunately, when using pROC, I can't compare my averaged ROC curve because it requires a pROC roc object. That's fine, but the catch is that the pROC roc object requires the original prediction and reference data to create. As far as I can tell, ROCR is averaging the ROC curves themselves and not the predictions, so it seems I can't get what I want out of ROCR.
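For reference, the averaged curve can also be rebuilt outside ROCR as a plain data frame, without editing the package. The sketch below is an assumption about what vertical averaging amounts to (interpolate each run's TPR onto a shared FPR grid, then take row means, as in the `rowMeans` lines quoted above); it is not ROCR's own code. `vertical_average` is a hypothetical helper, and the two small curves are made-up stand-ins for `perf@x.values` and `perf@y.values`.

```r
# Hypothetical helper (not part of ROCR): vertically average several ROC
# curves by interpolating each TPR onto a shared FPR grid.
vertical_average <- function(fpr_list, tpr_list, n_grid = 101) {
  stopifnot(length(fpr_list) == length(tpr_list))
  grid <- seq(0, 1, length.out = n_grid)
  # One column per curve: TPR linearly interpolated at every grid FPR.
  # ties = mean collapses the vertical steps that share one FPR value.
  tpr_on_grid <- mapply(function(fpr, tpr) {
    approx(fpr, tpr, xout = grid, ties = mean, rule = 2)$y
  }, fpr_list, tpr_list)
  data.frame(fpr = grid, tpr = rowMeans(tpr_on_grid))
}

# Two made-up curves standing in for perf@x.values and perf@y.values.
fpr_list <- list(c(0, 0.5, 1), c(0, 0.25, 1))
tpr_list <- list(c(0, 0.8, 1), c(0, 0.6, 1))
avg_curve <- vertical_average(fpr_list, tpr_list, n_grid = 5)
print(avg_curve)
```

With a real ROCR `performance` object, the same call would presumably be `vertical_average(perf@x.values, perf@y.values)`. The result is still only an averaged curve, so it does not by itself give pROC the predictions it needs.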

Is there a way to reverse-engineer the predictions from the averaged ROC curve created by ROCR?

Prophet60091
  • Have you looked to see if the predict command would work with ROC? – Dave2e Apr 25 '16 at 02:42
  • @Dave2e - I have, but I didn't make much headway. I've assigned a variable after the last line above `perf.avg.rocr <<- perf.avg`, which gives me a ROCR `performance` object, and the desired average ROC plot. Unfortunately, I now realize I can't use `roc.test` because it's not a `prediction` object. Any other advice welcomed... – Prophet60091 Apr 26 '16 at 22:44
  • Have you looked at this answer: http://stackoverflow.com/questions/11467855/roc-curve-in-r-using-rocr-package or this https://hopstat.wordpress.com/2014/12/19/a-small-introduction-to-the-rocr-package/ I have not used the ROCR library, so I can't provide much more advice – Dave2e Apr 26 '16 at 22:59
  • @Dave2e - Ya gotta love how that question on SO has been upvoted 16 times and is entirely RTFM, whereas I ask something programmatic in nature that has me honestly stumped and I get downvoted. Anyway, thanks! I'm (now) pretty versed in the usage of `ROCR`. It's just that it doesn't do what I need it to. To make matters worse, `pROC` only accepts a `roc` object for statistical testing, which itself requires the original prediction and reference data. I'll keep at it on my end. – Prophet60091 Apr 28 '16 at 00:35
  • @Prophet60091 By any chance were you able to figure out a solution? I am looking to be able to extract the data frame for individual ROC curves, so if you could guide me I would appreciate it. – Keshav M Dec 02 '17 at 19:49

1 Answer


I ran into the same problem. As I see it, the average ROC curve generated by the ROCR package is just a set of numeric values and lacks the other statistical attributes (e.g. a confidence interval). That means statistics computed on the average ROC curve may not be meaningful, which is why a roc object can't be built from a (tpr, fpr) list in the pROC package. However, I found a paper that addresses this problem, i.e. the comparison between average ROC curves. The title is "The average area under correlated receiver operating characteristic curves: a nonparametric approach based on generalized two-sample Wilcoxon statistics". I hope that helps.
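As a starting point only, the quantity being averaged can be sketched in base R: the AUC of each run equals the normalized Wilcoxon (Mann-Whitney) rank statistic, and the average AUC is the mean over runs. This is not the test from the paper, which also has to account for the correlation between curves; `auc_rank` is a hypothetical helper, the scores are made up, and labels are assumed to be coded 0/1.

```r
# Hypothetical helper: AUC as the normalized Mann-Whitney U statistic.
# Assumes labels are coded 1 (positive) and 0 (negative).
auc_rank <- function(scores, labels) {
  pos <- scores[labels == 1]
  neg <- scores[labels == 0]
  r <- rank(c(pos, neg))  # midranks, so tied scores count as half a win
  u <- sum(r[seq_along(pos)]) - length(pos) * (length(pos) + 1) / 2
  u / (length(pos) * length(neg))
}

# Made-up scores and labels for two cross-validation runs.
runs <- list(
  list(scores = c(0.9, 0.8, 0.3, 0.2), labels = c(1, 1, 0, 0)),
  list(scores = c(0.9, 0.4, 0.6, 0.1), labels = c(1, 1, 0, 0))
)
fold_auc <- sapply(runs, function(run) auc_rank(run$scores, run$labels))
mean_auc <- mean(fold_auc)
print(fold_auc)
print(mean_auc)
```

Unlike the averaged curve, this needs the per-run predictions and labels, which the `prediction` object already holds.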

  • Actually, I implemented the method proposed in that paper, and the results seem reasonable. It's a good choice if you want to run a statistical test between average ROC curves. – deserve Apr 17 '20 at 16:07