1

I want to achieve the exact same thing asked in this question: How to plot the survival curve generated by survreg (package survival of R)?

Except for the fact that I don't want the data to be stratified by a variable (in the question above it was stratified by sex).

I just want the progression free survival for the whole group of treated patients.

So when I copy the code from the other question, here is where I get stuck:

library(survminer)
library(tidyr)

s <- with(lung,Surv(time,status))
fKM <- survfit(s ~ sex,data=lung)
sWei <- survreg(s ~ as.factor(sex),dist='weibull',data=lung)  # in my case here I would replace as.factor(sex) by 1

pred.sex1 = predict(sWei, newdata=list(sex=1),type="quantile",p=seq(.01,.99,by=.01)) #Since I don't want to stratify, what do I do with these 2 lines of code?
pred.sex2 = predict(sWei, newdata=list(sex=2),type="quantile",p=seq(.01,.99,by=.01))

df = data.frame(y=seq(.99,.01,by=-.01), sex1=pred.sex1, sex2=pred.sex2)
df_long = gather(df, key= "sex", value="time", -y)

p = ggsurvplot(fKM, data = lung, risk.table = T)
p$plot = p$plot + geom_line(data=df_long, aes(x=time, y=y, group=sex))

I tried replacing as.factor(sex) by 1 and then the rest of the code just does not make sense, can someone help me with this?

Many thanks in advance!

Karima21
  • 51
  • 1
  • 9

1 Answers1

1

If you just want to plot the overall empirical survival curve, you might do something like this:

library(survival)
library(survminer)
library(tidyr)

s   <- with(lung, Surv(time, status))
fKM <- survfit(s ~ 1, data = survival::lung)
ggsurvplot(fKM, ggtheme = theme_bw())

However, if you want to fit a Weibull model with no predictors, then your formula is fine.

sWei  <- survreg(s ~ 1, dist = 'weibull', data = lung) 
probs <- seq(0.01, 1, by = 0.01)
time  <- predict(sWei, type = "quantile", se = TRUE, p = probs)

The only problem is that time is now a named list of two matrices: fit and se.fit. Both have the same number of rows as lung, but all rows are identical, so we just take one from each and calculate the confidence interval in a data frame which we can then use to create a ggplot:

ggplot(data = data.frame(p     = 1 - probs, 
                         time  = time$fit[1,],
                         upper = time$fit[1,] + 1.96 * time$se.fit[1,],
                         lower = time$fit[1,] - 1.96 * time$se.fit[1,])) + 
  geom_step(aes(p, time, colour = "All"), size = 1) +
  geom_ribbon(aes(p, ymin = lower, ymax = upper, fill = "All"), alpha = 0.2) +
  coord_flip(ylim = c(0, 1000)) +
  scale_fill_discrete(name = "Strata") +
  scale_color_discrete(name = "Strata") +
  theme_bw() +
  theme(legend.position = "top")

enter image description here

Which we can see looks like a pretty good fit.

If you want both in the same plot you can do something like:

df <- data.frame(p     = 1 - probs, 
                 time  = time$fit[1,],
                 upper = time$fit[1,] + 1.96 * time$se.fit[1,],
                 lower = time$fit[1,] - 1.96 * time$se.fit[1,])

ggsurvplot(fKM, ggtheme = theme_bw())$plot +
  geom_line(data = df, aes(time, p), linetype = 2, size = 1) +
  geom_line(data = df, aes(upper, p), linetype = 2, size = 1) +
  geom_line(data = df, aes(lower, p), linetype = 2, size = 1)

enter image description here

Created on 2020-08-18 by the reprex package (v0.3.0)

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Thank you for your help, the code you provided worked. I do have 2 questions though: 1. How do I plot the two curves in the same figure? So both kaplan meier and Weibull curve in the same figure like they did in the question which I have linked above. 2. The time on the x-axis, in what units is that displayed now? When I use your code with my data I need the x-axis to represent time in months and the Y-axis in 0-100% preferably. how do I accomplish that. excuse these questions as I am still a beginner in R... – Karima21 Aug 19 '20 at 09:56