I would like to calculate the optimal cut-off value, in my case the point where sensitivity and specificity intersect, to define a decision rule for a logistic regression classification approach. Searching Stack Overflow, I found a suggested solution that uses ROCR to calculate the cut-off maximizing sensitivity plus specificity.

However, when I use the ROCR package to plot the sensitivity and specificity values (y-axis) on a joint scale as a function of the cut-off values (x-axis) of my prediction object (computed by the eRm package), I get the following figure (see below).

Now, if I calculate the point of intersection of both functions, where specificity and sensitivity are maximized, as suggested in the earlier thread, I get a value that lies near, but not at, the point I would visually identify as the point of intersection.

My question is rather simple: Could someone show me a way to calculate the point of intersection of both functions to get an 'optimal' cut-off point in R?

Figure 1: Example plot of sensitivity and specificity as a function of the probability cutoff. The dotted line indicates the 'optimal' cutoff value, which deviates from the visually detected optimal threshold. The 'optimal' cutoff value was calculated as suggested in an earlier Stack Overflow thread.

library(ROCR)
library(eRm)
set.seed(1)
data <- sim.rasch(30, 300) # simulate Rasch homogeneous data
model.RM <- RM(data, se = TRUE) # estimate Rasch model
PPAR.X <-person.parameter(model.RM)
#Goodness-of-fit test (see Mair et al. 2008)
gof.model.RM<-gofIRT(PPAR.X)
#summary(gof.model.RM) 

#ROCR
pred.model.RM <- gof.model.RM$predobj
Sens.model.RM <- performance(pred.model.RM,  measure="sens", x.measure="cutoff")
Spec.model.RM <- performance(pred.model.RM,  measure="spec", x.measure="cutoff")

#Identify the 'optimal' cutoff that yields the highest sensitivity and specificity according to prior stack overflow thread:
SensSpec.model.RM <- performance(pred.model.RM,  "sens", "spec")
CP<-SensSpec.model.RM@alpha.values[[1]][which.max(SensSpec.model.RM@x.values[[1]]+SensSpec.model.RM@y.values[[1]])]
# [1] 0.5453864 # 'optimal' cutoff value

#Plot
plot(Sens.model.RM, type="l", col="red",xlab="",ylab="")
par(new=TRUE)
plot(Spec.model.RM, type="l", col="blue", xlab="Probability cutoff (threshold)",ylab="Sensitivity/Specificity")
abline(v = CP, col = "black", lty = 3)#add a line indicating the suggested 'optimal' cutoff value differing from the visually expected one
Fritz
  • When asking for help on this site, you should include a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data so different methods can be tested. – MrFlick Mar 01 '16 at 19:30
  • @MrFlick Thanks for the advice. A working example has been added. – Fritz Mar 01 '16 at 23:06
  • @MrFlick Now the example works and should run properly (with the package versions mentioned above: eRm_0.15-6, ROCR_1.0-7). I have tested it several times. The line causing errors has been removed. However, the problem still exists. The 'optimal' cut point should be the point where the two curves cross. However, the deviation exists… – Fritz Mar 02 '16 at 09:06
  • Optimal is defined in my context as the "cutoff that yields the highest sensitivity plus specificity", as mentioned in the earlier thread described above. This point should be the point of intersection. Figure 1 shows that the calculated 'optimal' cutoff point leads to a higher specificity at the cost of sensitivity. Interestingly, when changing the seed, the calculated result sometimes does mark the intersection point. Following from that, my question is rather simple: Why? – Fritz Mar 02 '16 at 19:04
  • …And the second question is: How can I calculate the cutoff value corresponding to the intersection point? My question is not whether this definition is perfect or whether there could be another definition of an 'optimal' decision (there really are a lot of definitions). I would only like to understand why the deviation occurs, and why it occurs in some but not all cases. – Fritz Mar 02 '16 at 19:08

1 Answer

If you want to find the largest sum, you can do

best.sum <- which.max(Sens.model.RM@y.values[[1]]+Spec.model.RM@y.values[[1]])
Sens.model.RM@x.values[[1]][best.sum]
# [1] 0.5453863
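
The largest sum is the maximum of Youden's index (sensitivity + specificity - 1), and nothing forces that maximum to sit where the two curves cross. A minimal sketch with made-up numbers (not the eRm output; all variable names here are illustrative) shows the two criteria picking different cutoffs:

```r
# Toy curves on a common cutoff grid (invented values for illustration only)
cutoff <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)
sens   <- c(0.99, 0.97, 0.94, 0.90, 0.80, 0.72, 0.50, 0.30, 0.10)
spec   <- c(0.10, 0.25, 0.40, 0.55, 0.68, 0.85, 0.92, 0.97, 0.99)

# Youden's index: same maximizer as sens + spec
youden <- sens + spec - 1

# Cutoff with the largest sum of sensitivity and specificity
max_sum_cut <- cutoff[which.max(youden)]

# Grid cutoff at which the two curves are closest to each other
crossing_cut <- cutoff[which.min(abs(sens - spec))]

print(c(max_sum = max_sum_cut, crossing = crossing_cut))
```

In this toy case the sum peaks one grid step to the right of the crossing, because specificity is still rising faster than sensitivity is falling there. When the two curves are roughly mirror images around the crossing, both criteria can land on the same cutoff, which would be consistent with the result changing from seed to seed.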

If you want to find the closest intersection, you can do

both.eq <- which.min(abs(Sens.model.RM@y.values[[1]]-Spec.model.RM@y.values[[1]]))
Sens.model.RM@x.values[[1]][both.eq]
# [1] 0.5380422
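
The value above is the observed cutoff at which the two curves are closest, so it is limited to the cutoffs that occur in the data. If a crossing between two observed cutoffs is wanted, one possible sketch is to interpolate the difference linearly and solve for its root. The helper name `crossing_cutoff` is hypothetical, and the approach assumes the difference changes sign exactly once over the cutoff range:

```r
# Hypothetical helper: cutoff at which linearly interpolated sensitivity
# and specificity curves cross. Assumes sens - spec changes sign once.
crossing_cutoff <- function(cutoff, sens, spec) {
  # Drop non-finite entries (e.g. an infinite first cutoff, if present)
  ok <- is.finite(cutoff) & is.finite(sens) & is.finite(spec)
  cutoff <- cutoff[ok]
  d <- (sens - spec)[ok]
  # Piecewise-linear interpolation of the difference between the curves
  diff_fun <- approxfun(cutoff, d)
  # Root of the difference = crossing point
  uniroot(diff_fun, interval = range(cutoff), tol = 1e-9)$root
}

# Toy data (invented values), cutoffs in decreasing order with a leading Inf
toy_cutoff <- c(Inf, 0.8, 0.6, 0.4, 0.2)
toy_sens   <- c(0.0, 0.2, 0.5, 0.8, 0.9)
toy_spec   <- c(1.0, 0.9, 0.8, 0.6, 0.3)
toy_root   <- crossing_cutoff(toy_cutoff, toy_sens, toy_spec)
print(toy_root)

# With the objects from the question, the call would presumably be:
# crossing_cutoff(Sens.model.RM@x.values[[1]],
#                 Sens.model.RM@y.values[[1]],
#                 Spec.model.RM@y.values[[1]])
```

`uniroot` stops with an error if the difference has the same sign at both ends of the interval, and it returns only one root if the curves cross several times, so it is worth checking the plot first.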
MrFlick