Here is a toy competing-risks survival problem:

library(randomForestSRC)
library(data.table)

set.seed(10)

N = 500
d = data.table(
    x1 = runif(N, 100, 150),
    x2 = runif(N, 100, 150))
d[, e1.time := x1 + runif(N, -2, 2)]
d[, e2.time := x2 + runif(N, -2, 2)]
d[, survival.time := pmin(e1.time, e2.time)]
d[, status := ifelse(e1.time < e2.time, 1L, 2L)]

m = rfsrc(
    Surv(survival.time, status) ~ x1 + x2,
    data = d,
    nsplit = 3, ntree = 100)

pred = predict(m)

Now I want to make predictions for each case. (For this example, let's put aside issues of overfitting and just make predictions on the same data the model is trained with, ignoring in- versus out-of-bag distinctions.) If I understand correctly, I can predict which event case i will end up with by comparing pred$cif[i, dim(pred$cif)[2], 1] to pred$cif[i, dim(pred$cif)[2], 2], which give the cumulative incidence (the estimated probability) of each event at the last time point. But I don't see how to predict

  • The expected time until case i reaches any event
  • The expected time until case i reaches some event e, conditional on no other event happening first

I initially thought that pred$predicted had expected survival times, but they're on a totally different scale from the survival times in the data, so apparently not.
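
To make the event comparison concrete, here is a minimal sketch of what I mean. It uses a small made-up array (the name cif and its values are hypothetical) in place of pred$cif, and it assumes the layout is cases by time points by event types:

```r
# Toy stand-in for pred$cif: 3 cases x 4 time points x 2 event types.
# The values are invented for illustration; they are not model output.
cif = array(0, dim = c(3, 4, 2))
cif[, 4, 1] = c(0.7, 0.2, 0.5)  # cumulative incidence of event 1 at the last time point
cif[, 4, 2] = c(0.3, 0.8, 0.5)  # cumulative incidence of event 2 at the last time point

# Index of the last time point, as in dim(pred$cif)[2].
last = dim(cif)[2]

# For each case, pick the event with the larger cumulative incidence
# at the last time point (ties go to the first event).
predicted.event = max.col(cif[, last, ], ties.method = "first")
```

With the real model, the same comparison would presumably be max.col(pred$cif[, last, ]) with last = dim(pred$cif)[2], but that only gives the event type, not either of the expected times above.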
