After some digging (1, 2, 3), there appear to be a few posts about formulas used inside functions causing scoping issues, if I am understanding them correctly. Some suggest using an environment, assign, or <<- to get around the issue, but I've been stumped on how to use them (and confused about why there's an issue in the first place).
Let's try this toy code:
library(survival)
library(survminer)
set.seed(1)
give_p_val <- function() {
  df <- data.frame('OS' = ovarian[, 'futime'], 'Survival_event' = ovarian[, 'fustat'])
  subgroup <- sample(nrow(df), nrow(df)/2)
  df$Class <- 'A'
  df$Class[subgroup] <- 'B'
  fit2 <- survfit(Surv(OS, Survival_event) ~ Class, data=df)
  return(surv_pvalue(fit2))
}
give_p_val()
Calling the function fails, yet the same lines work when run directly at the top level, which hints at a scoping issue.
This code works and returns a fitted object:
survfit(Surv(futime, fustat) ~ rx, data=ovarian)
So why does it break when we copy the data frame inside a function's scope?
testit <- function() {
  ovarian2 <- ovarian
  fit2 <- survfit(Surv(futime, fustat) ~ rx, data=ovarian2)
  return(surv_pvalue(fit2))
}
testit()
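To show what I think the linked posts are describing, here is a minimal base-R sketch that uses lm() instead of survfit(). It only illustrates the general mechanism: a fitted object stores its call unevaluated, so the data argument is just a name that has to be looked up again later. I have not confirmed that this is what surv_pvalue() does internally, and make_fit and local_df are made-up names for illustration.

```r
# Hypothetical helper (not from survival/survminer): fits a model on a
# data frame that exists only inside the function.
make_fit <- function() {
  local_df <- data.frame(x = 1:3)
  lm(x ~ 1, data = local_df)
}

fit <- make_fit()

# The stored call keeps `data` as an unevaluated name, not the data itself.
data_arg <- fit$call$data
print(class(data_arg))

# Looking that name up from the global environment fails, because
# local_df was never defined there.
lookup_global <- tryCatch(
  eval(data_arg, envir = globalenv()),
  error = function(e) "not found"
)

# The formula remembers the environment it was created in, so the same
# lookup succeeds there.
lookup_formula_env <- eval(data_arg, envir = environment(formula(fit)))
print(lookup_formula_env)
```

If that is the mechanism, any later step that re-evaluates the data name outside the function would either fail or pick up an unrelated object with the same name.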
Ultimately, how do I create a data frame inside a function so that it is handled correctly by the formula being used? Thanks!