0

The great tutorial at how to extract formula from coxph model summary in R? shows how to summarize univariate cox ph results for a list of variables; however, this code only works for continuous or binary covariates. If you have a categorical variable such as race, with 3 or more levels, the code fails.

For example, let's randomly assign a race.

race<-c("White", "Black","Other")
lung$race<-sample(race,12,replace = TRUE)

covariates <- c("race","age", "sex",  "ph.karno", "ph.ecog", "wt.loss")
univ_formulas <- sapply(covariates,
                        function(x) as.formula(paste('Surv(time, status)~', x)))

univ_models <- lapply( univ_formulas, function(x){coxph(x, data = lung)})
# Extract data 
univ_results <- lapply(univ_models,
                       function(x){ 
                         x <- summary(x)
                         p.value<-signif(x$wald["pvalue"], digits=2)
                         wald.test<-signif(x$wald["test"], digits=2)
                         beta<-signif(x$coef[1], digits=2);#coeficient beta
                         HR <-signif(x$coef[2], digits=2);#exp(beta)
                         HR.confint.lower <- signif(x$conf.int[,"lower .95"], 2)
                         HR.confint.upper <- signif(x$conf.int[,"upper .95"],2)
                         HR <- paste0(HR, " (", 
                                      HR.confint.lower, "-", HR.confint.upper, ")")
                         res<-c(beta, HR, wald.test, p.value)
                         names(res)<-c("beta", "HR (95% CI for HR)", "wald.test", 
                                       "p.value")
                         return(res)
                         #return(exp(cbind(coef(x),confint(x))))
                       })
res <- t(as.data.frame(univ_results, check.names = FALSE))
as.data.frame(res)

The final output fails because the univariate results for race have a different number of dimensions.

I have found that I can extract the HR for the individual levels in this way:
coef(summary(coxph(Surv(Time,status) ~ Race, data=dataset)))[,2]

However, when I try and include this as an item in the data frame, my code breaks down. I would greatly appreciate any assistance. I have done quite a bit of googling and banging my head against the wall.

Thank you in advance.

tristanape
  • 31
  • 4
  • 1
    It would be easier to help if you create a small reproducible example along with expected output. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Apr 16 '21 at 06:47
  • "breaks down" is very similar in its information content to "doesn't work", conveying very little in the way of specifics to guide advising you. – IRTFM Apr 17 '21 at 22:46
  • Thank you @IRTFM & Ronak Shah. I should have done that from the beginning. Now I have a solution to my own problem. – tristanape Apr 19 '21 at 15:25

1 Answers1

0

I found Nikk Kennedy's solution to be so helpful. Not only did he provide example code, but he turned around and made a package for R! Hats off to the man.

Take the example from above, but first, load in the forestmodel package.

library(forestmodel)
univ_formulas <- sapply(covariates, function(x) as.formula(paste('Surv(time, status)~', x)))

univ_models <- lapply(univ_formulas, function(x){coxph(x, data = lung)})

forest_model(model_list = univ_models,covariates = covariates,merge_models = T)

Tada. Remember, race was randomly assigned so take these results with a grain of salt!

forestmodel output

tristanape
  • 31
  • 4