5

I have data called veteran stored in R. I created a survival model and now wish to predict survival probability predictions. For example, what is the probability that a patient with 80 karno value, 10diagtime, age 65 and prior=10 and trt = 2 lives longer than 100 days?

In this case the design matrix is x = (1,0,1,0,80,10,65,10,2)

Here is my code:

library(survival)
attach(veteran)
weibull <- survreg(Surv(time,status)~celltype + karno+diagtime+age+prior+trt ,dist="w")

and here is the output:

enter image description here

Any idea how to predict the survival probabilities?

Günal
  • 751
  • 1
  • 14
  • 29
  • You could check out the function `predict.survreg`, which will allow you to compute survival probabilities. – msoftrain Dec 10 '14 at 19:06
  • Did you try the `predict()` function? To make it easier for others to help you, please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input so we might test out different solutions. Also, you should learn to avoid `attach()` and just use the `data=` parameter in modeling functions. – MrFlick Dec 10 '14 at 19:07
  • @MrFlick, does it make any difference in the results to use attach() in my way? Also, how could I write the code with colored background as you guys do? I used the predict function, but it produces the survival time, not the probabilities. Am I right? – Günal Dec 10 '14 at 19:11
  • 1
    It doesn't make a different in the result, but using `attach()` is not a best practice. How are you defining "survival probabilities"? You usually need some notion of a specific time i would think. – MrFlick Dec 10 '14 at 19:17
  • Above I described a patient. How could I predict the probability that this patients lives longer than 100 days? – Günal Dec 10 '14 at 19:19
  • @Günal. Sorry, you're right. The specific scenario is included in the question and I missed it. – MrFlick Dec 10 '14 at 19:22
  • 1
    http://stat.ethz.ch/R-manual/R-patched/library/survival/html/predict.survreg.html includes different types of prediction. Look at the argument `type` in the `predict` function – Davide Passaretti Dec 10 '14 at 19:23
  • @DavidePassaretti, I used `predict(weibull, data=x)` but this gives the survival times, not the probabilities or am I missing something out? – Günal Dec 10 '14 at 19:32
  • Perhaps this question can help you: http://stackoverflow.com/questions/9151591/how-to-plot-the-survival-curve-generated-by-survreg-package-survival-of-r – MrFlick Dec 10 '14 at 19:48

1 Answers1

6

You can get predict.survreg to produce predicted times of survival for individual cases (to which you will pass values to newdata) with varying quantiles:

 casedat <- list(celltype="smallcell", karno =80, diagtime=10, age= 65 , prior=10 , trt = 2)
 predict(weibull, newdata=casedat,  type="quantile", p=(1:98)/100)
 [1]   1.996036   3.815924   5.585873   7.330350   9.060716  10.783617
 [7]  12.503458  14.223414  15.945909  17.672884  19.405946  21.146470
[13]  22.895661  24.654597  26.424264  28.205575  29.999388  31.806521
[19]  33.627761  35.463874  37.315609  39.183706  41.068901  42.971927
[25]  44.893525  46.834438  48.795420  50.777240  52.780679  54.806537
[31]  56.855637  58.928822  61.026962  63.150956  65.301733  67.480255
[37]  69.687524  71.924578  74.192502  76.492423  78.825521  81.193029
[43]  83.596238  86.036503  88.515246  91.033959  93.594216  96.197674
[49]  98.846083 **101.541291** 104.285254 107.080043 109.927857 112.831032
[55] 115.792052 118.813566 121.898401 125.049578 128.270334 131.564138
[61] 134.934720 138.386096 141.922598 145.548909 149.270101 153.091684
[67] 157.019655 161.060555 165.221547 169.510488 173.936025 178.507710
[73] 183.236126 188.133044 193.211610 198.486566 203.974520 209.694281
[79] 215.667262 221.917991 228.474741 235.370342 242.643219 250.338740
[85] 258.511005 267.225246 276.561118 286.617303 297.518110 309.423232
[91] 322.542621 337.160149 353.673075 372.662027 395.025122 422.263020
[97] 457.180183 506.048094
#asterisks added

You can then figure out which one is greater than the specified time and it looks to be around the 50th percentile, just as one might expect from a homework question.

png(); plot(x=predict(weibull, newdata=casedat,  type="quantile", 
             p=(1:98)/100),  y=(1:98)/100 , type="l") 
dev.off()

enter image description here

IRTFM
  • 258,963
  • 21
  • 364
  • 487
  • thanks for the answer. Just to make sure I have understood you correctly, the probability a patients lives more than 500 is 1/98. IS that right? Can you also please briefly explain what those predictions mean? – Günal Dec 10 '14 at 21:01
  • The quantiles are for the hazard rather than survival. If you plotted 1-p as the y-value you would get the typical estimated survival – IRTFM Dec 10 '14 at 21:12
  • And the quantile fo the time>506 is 98/100 rather than 1/98. – IRTFM Dec 10 '14 at 21:19