
I'm running some Surv() functions, and one thing I do not like, or understand, is why this function does not take a "data=" argument. This is annoying because I want to perform the same Surv() function on the same data frame but filtered by different criteria each time.

So for example, my data frame is called "ikt" and I want to filter by "donor_type2=='LD'" and also use a strata variable "plan2". I tried the following but it didn't work:

library(survival)
library(dplyr)

ikt<-data.frame(organ_yrs=(seq(1,20)),
           organ_status=rep(c(0,0,1,1),each=5),
           plan2=rep(c('A','B','A','B'),each=5),
           donor_type2=rep(c('LD','DD'),each=10) )

organ_surv_func<-function(data,criteria,strata) {
data2<-filter(data,criteria)
Surv(data2$organ_yrs,data2$organ_status)~data2$strata
}

organ_surv_func(ikt,donor_type2=='LD',plan2)

Error in filter_impl(.data, quo) : object 'donor_type2' not found

I'm coming from a SAS background so that's probably why I'm thinking this should work and it doesn't...

I looked up something about sapply(), but I don't think that works when the function doesn't have the data= option.

Also, the reason I need the Surv() object and not just survfit(Surv()) (which would let me use data=) is that I'm also using survdiff() for log-rank tests, which takes the Surv() object as its main argument:

lr<-function (surv) {
round(1-pchisq(survdiff(surv)$chisq,length(survfit(surv)$strata)-1),3)
}

Thanks for any help you can provide.

IRTFM
Scott Jackson
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Be clear where the `Surv()` function is coming from (that is not a base R function). – MrFlick Aug 02 '18 at 15:54
  • okay I added that in – Scott Jackson Aug 02 '18 at 16:06
  • `Surv()` is usually used in formulas to describe relationships between variables. It doesn't usually come along with data. How exactly are you trying to use this with `survdiff`? But really your current problem (error message) is about how you are trying to pass parameters to your `dplyr` function. Check out the [programming with dplyr vignette](https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html). SAS is very different from R. SAS is more of a MACRO language but R requires proper functional programming techniques. – MrFlick Aug 02 '18 at 16:10
  • survdiff() takes in Surv(), and I'm using that as part of a function to generate the log-rank p-values. see above – Scott Jackson Aug 02 '18 at 16:11
  • According to the docs, `survdiff` takes a formula and it has a `data=` parameter. You should not be passing the result of `Surv()` directly to `survdiff()`. Maybe check out the help page. (Note the `~` in all the examples, which indicates a formula.) – MrFlick Aug 02 '18 at 16:14
  • Ah I see, I can just add the data argument to survdiff within my function. Thank you so much! – Scott Jackson Aug 02 '18 at 16:17

1 Answer

I'm writing this "answer" to caution you against proceeding down the path you seem to be following. The Surv function is really intended to be used as the LHS of a formula defined within one of the survival package functions. You should avoid using constructions like:

Surv(data2$organ_yrs,data2$organ_status)~data2$strata

For one thing it's needlessly verbose, but more importantly, it will break predict, which has to match the variable names in the formula against the columns of the new data. survdiff and the other survival functions all have a "data" argument as well as a "subset" argument. Either that argument or base R's subset function should allow you to avoid using filter.

organ_surv_func <- function(data, covar) {
  # Build the formula with the covariate name substituted in
  form <- as.formula(substitute(Surv(organ_yrs, organ_status) ~ covar, list(covar = covar)))
  survdiff(form, data = data)
}
# although I think running survdiff in a for-loop might be easier,
# as it would involve fewer tricky language constructs
organ_surv_func(subset(ikt, donor_type2 == 'LD'), covar = quote(plan2))
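
The for-loop alternative mentioned in the code comment above might look roughly like this. It is only a sketch: `toy` is made-up data (not the `ikt` frame from the question), chosen so that every donor type has events in both plan groups, and `results` is a hypothetical name.

```r
library(survival)

# Hypothetical toy data (an assumption for illustration, not the asker's data):
# two donor types, two plans, events present in every subgroup.
toy <- data.frame(
  organ_yrs    = c(1, 3, 5, 7, 2, 4, 6, 8, 1, 2, 3, 4, 5, 6, 7, 8),
  organ_status = c(1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1),
  plan2        = rep(c("A", "B", "A", "B"), each = 4),
  donor_type2  = rep(c("LD", "DD"), each = 8)
)

results <- list()
for (d in unique(toy$donor_type2)) {
  # Filter with base R, then let survdiff() resolve column names through data=
  results[[d]] <- survdiff(Surv(organ_yrs, organ_status) ~ plan2,
                           data = toy[toy$donor_type2 == d, ])
}

# One chi-square statistic per donor type
sapply(results, function(fit) fit$chisq)
```

Because the formula is written out literally, nothing has to be quoted or substituted; only the data frame changes from one iteration to the next.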

If you assign the output of survdiff to a named variable, you will be able to access its components, such as chisq, more economically:

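myfit <- organ_surv_func(subset(ikt, donor_type2 == 'LD'), covar = quote(plan2))
my.lr.test <- function(myfit) {
  # df = number of groups - 1; myfit$n holds the per-group counts
  # (a survdiff object has no $strata unless strata() is in the formula)
  round(1 - pchisq(myfit$chisq, length(myfit$n) - 1), 3)
}
my.lr.test(myfit) # not going to be useful with that dataset.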
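
The "subset" argument that the survival functions accept can also do the filtering inside the call itself. A minimal sketch, again on a made-up data frame `toy` (an assumption, since the LD rows of `ikt` contain no events); `fit_ld` and `p_ld` are hypothetical names:

```r
library(survival)

# Hypothetical toy data (an assumption for illustration, not the asker's data)
toy <- data.frame(
  organ_yrs    = c(1, 3, 5, 7, 2, 4, 6, 8, 1, 2, 3, 4, 5, 6, 7, 8),
  organ_status = c(1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1),
  plan2        = rep(c("A", "B", "A", "B"), each = 4),
  donor_type2  = rep(c("LD", "DD"), each = 8)
)

# subset= is evaluated within `data`, so no separate filter step is needed
fit_ld <- survdiff(Surv(organ_yrs, organ_status) ~ plan2,
                   data = toy, subset = donor_type2 == "LD")

# Log-rank p-value: degrees of freedom = number of groups - 1
p_ld <- 1 - pchisq(fit_ld$chisq, df = length(fit_ld$n) - 1)
round(p_ld, 3)
```

This form is convenient at the top level; inside a wrapper function, passing a pre-filtered data frame through data= tends to be simpler than forwarding an unevaluated subset expression.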
IRTFM