1

I have an issue with creating a ROC Curve for my survival tree created by the rpart package. My goal was to evaluate my survival tree through Area Under Curve (AUC) in ROC curve. I had tried many ways to plot a ROC curve but failed. How can I approach my next step the ROC curve plot?

Here is the R code I have so far:

library(survival) 
library("rpart") 
library("partykit") 
library(rattle)
library(rpart.plot)

temp = coxph(Surv(pgtime, pgstat) ~ age+eet+g2+grade+gleason+ploidy, stagec)
newtime = predict(temp, type = 'expected')


fit <- rpart(Surv(pgtime, pgstat) ~ age+eet+g2+grade+gleason+ploidy, data = stagec)

fancyRpartPlot(fit)

tfit <- as.party(fit) #Transfer "rpart" to "party"

predtree<-predict(tfit,newdata=stagec,type="prob") #Prediction

Here is the R code I have tried so far:

1.

library("ROCR")
predROCR <- prediction(predict(tfit, newdata = stagec, type = "prob")[, 2],labels=Surv(stagec$pgtime, stagec$pgstat))

Error in predict(tfit, newdata = stagec, type = "prob")[, 2] : 
  incorrect number of dimensions

It doesn't work. I checked the prediction result of a function of predict() and found that it is an ‘Survival’ object (Code and results are as follows:). I guess this method fails because it not suitable for "Survival" object?

predict(tfit, newdata = stagec, type = "prob")[[1]]
Call: survfit(formula = y ~ 1, weights = w, subset = w > 0)

      n  events  median 0.95LCL 0.95UCL 
     33       1      NA      NA      NA

I try to derive the survival function value of each terminal node and use these value to draw the ROC curve. Is this correct? The ROC curve drawn in this way seems to treat the predicted classification results as continuous variables rather than categorical variables.

Here's the R code I tried:

tree2 = fit
tree2$frame$yval = as.numeric(rownames(tree2$frame))

#Get the survival function value of each sample
Surv_value = data.frame(predict(tree2, newdata=stagec,type = "matrix"))[,1]
Out=data.frame()
Out=cbind(stagec,Surv_value)

#ROC
library(survivalROC)
roc=survivalROC(Stime=Out$pgtime, status=Out$pgstat, marker = Out$Surv_value, predict.time =5, method="KM")

roc$AUC #Get the AUC of ROC plot

#Plot ROC
aucText=c()
par(oma=c(0.5,1,0,1),font.lab=1.5,font.axis=1.5)

plot(roc$FP, roc$TP, type="l", xlim=c(0,1), ylim=c(0,1),col="#f8766d", 
  xlab="False positive rate", ylab="True positive rate",
  lwd = 2, cex.main=1.3, cex.lab=1.2, cex.axis=1.2, font=1.2)

aucText=c(aucText,paste0("498"," (AUC=",sprintf("%.3f",roc$AUC),")"))
legend("bottomright", aucText,lwd=2,bty="n",col=c("#f8766d","#00bfc4","blue","green"))
abline(0,1)
Gin
  • 11
  • 2
  • have you looked at [this](https://stackoverflow.com/questions/30818188/roc-curve-in-r-using-rpart-package)? – Edo Jul 31 '20 at 10:26
  • Can you show us what you have tried already? – Calimo Aug 01 '20 at 05:59
  • Thanks! In the edited question I updated the all methods I tried. @Calimo – Gin Aug 01 '20 at 06:45
  • Thanks! I have tried the method in the post you mentioned. The R code is updated in the edited question, but it still reports an error. I guess it may be because this method is not suitable for ‘survival’ objects? @Edo – Gin Aug 01 '20 at 06:58

0 Answers0