I would like to use mlr to run xgboost on right-censored survival data in R. xgboost lists an objective function, survival:cox, which is described as:
survival:cox: Cox regression for right censored survival time data (negative values are considered right censored).
mlr 2, which I am using, only supports xgboost through its regression and classification learners. If I use the built-in regression learner for xgboost, it uses MSE as the evaluation measure. So I tried changing the measure to cindex and got this error:
Measures: cindex cindex
Error in FUN(X[[i]], ...) : Measure cindex does not support task type regr!
So then I tried to write a new survival learner for xgboost, which is just a copy of the regression learner with "Regr" changed to "Surv". Of course, a survival task expects the target to have two columns (time and status) and does not accept negative times, whereas xgboost expects a single column (time) and treats any row with a negative time as censored.
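To make the mismatch concrete, here is a minimal base-R sketch of the two encodings as I understand them. The helper names `to_signed_time` and `from_signed_time` are my own, not part of mlr or xgboost, and the sketch assumes all observed times are strictly positive (a time of 0 could not carry a sign).

```r
# Toy right-censored data: status 1 = event observed, 0 = censored.
surv_df <- data.frame(
  time   = c(5, 12, 7, 30),
  status = c(1, 0, 1, 0)
)

# Hypothetical helper: collapse (time, status) into the single signed
# label that survival:cox is described as expecting.
to_signed_time <- function(time, status) {
  ifelse(status == 1, time, -time)
}

# Hypothetical inverse: recover (time, status) from the signed label.
# Assumes times are strictly positive.
from_signed_time <- function(label) {
  data.frame(time = abs(label), status = as.numeric(label > 0))
}

label <- to_signed_time(surv_df$time, surv_df$status)
round_trip <- from_signed_time(label)

print(label)
print(round_trip)
```

A survival task wants the two-column form (`surv_df`), while xgboost wants the one-column form (`label`), so a custom learner would presumably need to do this conversion internally rather than in the task data.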
Below is what I have tried. Is there any way to achieve this in mlr2 or mlr3?
- Using built-in regression learner for xgboost:
library(survival)  # provides the veteran data set
library(mlr)

data(veteran)
# Keep the predictors plus time; status is folded into the sign of time below.
veteran_xgb <- veteran[c("trt", "karno", "diagtime", "age", "prior", "time")]
# survival:cox convention: a negative time marks a right-censored row.
veteran_xgb$time <- ifelse(veteran$status == 1, veteran$time, -veteran$time)

xgb.task <- makeRegrTask(id = "XGBOOST_VET", data = veteran_xgb, target = "time")
xgb_learner <- makeLearner(
  cl = "regr.xgboost",
  id = "xgboost",
  predict.type = "response",
  par.vals = list(
    objective = "survival:cox",
    eval_metric = "cox-nloglik",
    nrounds = 200
  )
)

learners <- list(xgb_learner)
outer <- makeResampleDesc("CV", iters = 5)
# Benchmarking: no measures are passed here, so the regression default is used.
bmr <- benchmark(learners, xgb.task, outer, show.info = TRUE)
- Using custom surv learner for xgboost:
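library(survival)  # provides the veteran data set
library(mlr)
# Custom learner file linked below; assumed to be in the working directory.
source("RLearner_surv_xgboost.R")

data(veteran)
# Keep the predictors plus both target columns.
veteran_xgb <- veteran[c("trt", "karno", "diagtime", "age", "prior", "time", "status")]
# Negative time marks censoring for xgboost, but the survival task
# does not accept negative times, so this is where the attempt breaks down.
veteran_xgb$time <- ifelse(veteran$status == 1, veteran$time, -veteran$time)

xgb.task <- makeSurvTask(id = "XGBOOST_VET", data = veteran_xgb, target = c("time", "status"))
xgb_learner <- makeLearner(
  cl = "surv.xgboost",
  id = "xgboost",
  predict.type = "response",
  par.vals = list(
    objective = "survival:cox",
    eval_metric = "cox-nloglik",
    nrounds = 200
  )
)

learners <- list(xgb_learner)
outer <- makeResampleDesc("CV", iters = 5)
surv.measures <- list(cindex)
# Benchmarking with the concordance index as the measure.
bmr <- benchmark(learners, xgb.task, outer, surv.measures, show.info = TRUE)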
The file RLearner_surv_xgboost.R can be downloaded from OneDrive here https://1drv.ms/u/s!AjTjdzp0sDJRrhZtZF5-HZF2BrBB?e=FNLS94
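For comparison, this is roughly what I am trying to get mlr to do, sketched by calling xgboost directly. It assumes the xgboost and survival packages are installed. The hyperparameters (`eta`, `max_depth`, 50 rounds) are arbitrary placeholders, and it trains and evaluates on the same rows purely for illustration, so the resulting C-index is optimistic and not a substitute for the cross-validated benchmark above.

```r
library(survival)  # veteran data, Surv(), concordance()
library(xgboost)

veteran <- survival::veteran
features <- c("trt", "karno", "diagtime", "age", "prior")
X <- as.matrix(veteran[, features])

# Single-column label: positive time = event, negative time = censored.
signed_time <- ifelse(veteran$status == 1, veteran$time, -veteran$time)
dtrain <- xgb.DMatrix(data = X, label = signed_time)

set.seed(1)
bst <- xgb.train(
  params = list(
    objective = "survival:cox",
    eval_metric = "cox-nloglik",
    eta = 0.1,       # placeholder value
    max_depth = 2    # placeholder value
  ),
  data = dtrain,
  nrounds = 50
)

# With survival:cox the predictions are on the hazard-ratio scale,
# so a higher value means higher risk (shorter expected survival).
risk <- predict(bst, dtrain)

# Score against the original two-column target; reverse = TRUE because
# larger predictions should go with shorter survival times.
eval_df <- data.frame(time = veteran$time, status = veteran$status, risk = risk)
fit <- concordance(Surv(time, status) ~ risk, data = eval_df, reverse = TRUE)
c_index <- fit$concordance
print(c_index)
```

The point of the sketch is that training needs the signed one-column label while the concordance index needs the original time and status, which is exactly the combination I cannot express with either the regression task or the survival task.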