
I am working on a project for a client that wants their charts done in Excel.

One of the charts I need to make shows cumulative hazards, which I get with survival::survfit.

My problem is that Excel can't draw stepwise charts, so I need to transform the data so that every time point occurs twice: once with the previous cumulative hazard and once with the cumulative hazard at that time point.

It is relatively easy, but annoying and time-consuming, to do this in Excel. Is there a smart way of doing it in R?

I am a relatively new R user, and I have not been able to figure out a way to do what I want.

Below I have tried to show what I get and what I want:

# Load the survival package
library(survival)

#Create survfit object
Survival_Function <- survfit(Surv(lung$time,
                                 lung$status == 2)~1)

# Extract event times and cumulative hazards
cumhaz <- data.frame(Survival_Function$time, Survival_Function$cumhaz)

head(cumhaz)

This gives me the following:

  Survival_Function.time Survival_Function.cumhaz
1                      5              0.004385965
2                     11              0.017601824
3                     12              0.022066110
4                     13              0.031034720
5                     15              0.035559606
6                     26              0.040105061

But for Excel to draw the chart properly, I need the data to look like this:

  Survival_Function.time Survival_Function.cumhaz
1                      5              0.004385965
2                     11              0.004385965
3                     11              0.017601824
4                     12              0.017601824
5                     12              0.022066110
6                     13              0.022066110
7                     13              0.031034720
8                     15              0.031034720
9                     15              0.035559606
10                    26              0.035559606
11                    26              0.040105061
  • Welcome to SO! Please have a look [how to make a good reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), i.e. provide an example of the data you have and the desired output format. Then it's easier to help you, thanks! – starja Nov 14 '22 at 11:18
  • Thanks! I have tried to update the question! – Asger Paludan-Müller Nov 14 '22 at 11:49

1 Answer

Based on your code, one simple approach is to repeat each column so that every element appears twice. Then remove the first element from the time column and the last element from the cumulative hazard column, so that the two columns line up and have the same length, and combine them into a data frame. For example:

x <- data.frame(
    Time = c(1,2,3,4,5), 
    Hazard = c(6,7,8,9,10)
)

data.frame(
    # Repeat each time twice, then drop the first element so the times shift up by one row
    Time = rep(x$Time, each = 2)[-1],
    # Repeat each hazard twice, then drop the last element so both columns have the same length
    Hazard = head(rep(x$Hazard, each = 2), -1)
)

This gives you the desired format: every time point after the first appears twice, first with the previous hazard and then with its own.
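As a rough sketch, the same idea applied to the survfit object from the question might look like the following. The names `fit` and `step_data` and the CSV file name are only illustrative choices, not anything from the question, and the export line is commented out because the output path is an assumption.

```r
library(survival)

# Same model as in the question; lung ships with the survival package
fit <- survfit(Surv(time, status == 2) ~ 1, data = lung)

n <- length(fit$time)

step_data <- data.frame(
  # Each time twice, minus the first copy: t1, t2, t2, t3, t3, ...
  time   = rep(fit$time, each = 2)[-1],
  # Each hazard twice, minus the last copy: h1, h1, h2, h2, ...
  cumhaz = rep(fit$cumhaz, each = 2)[-(2 * n)]
)

head(step_data)

# Hypothetical file name; a CSV like this can be opened directly in Excel
# write.csv(step_data, "cumhaz_steps.csv", row.names = FALSE)
```

Plotted in Excel as a scatter chart with straight connecting lines, data in this shape should trace out the steps.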

Note: If you have a large number of columns this becomes cumbersome, but for just two or so it should be fine.
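If more columns do need to be carried along (confidence limits, for instance), one possible way around the repetition is to build the row indices once and reuse them for every column. This is only a sketch: `to_step_data` is a hypothetical helper written for this illustration, not a function from the survival package, and the `Lower` column is made-up example data.

```r
# Hypothetical helper: duplicate rows so that the time column lags the
# value columns by one row, which is what a step chart needs.
to_step_data <- function(df, time_col) {
  n <- nrow(df)
  time_idx  <- rep(seq_len(n), each = 2)[-1]        # 1, 2, 2, 3, 3, ...
  value_idx <- rep(seq_len(n), each = 2)[-(2 * n)]  # 1, 1, 2, 2, 3, ...
  out <- df[value_idx, , drop = FALSE]              # repeat every value column
  out[[time_col]] <- df[[time_col]][time_idx]       # overwrite with shifted times
  rownames(out) <- NULL
  out
}

x <- data.frame(
  Time   = c(1, 2, 3),
  Hazard = c(6, 7, 8),
  Lower  = c(5, 6, 7)  # made-up extra column for illustration
)

res <- to_step_data(x, "Time")
res
```

Indexing rows rather than individual vectors means the helper does not need to know how many value columns there are.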

Beavis