I'm trying to run a survival analysis for hundreds of genes within a specific cancer type. I have two data frames, m2 and m3. m2 contains the sample ID, a column for overall survival (the survival time recorded for each sample), and a status column (whether the sample is alive or deceased). m3 has the sample ID in column 1, and columns 2:256 each hold a different gene, coded 1 if the sample has a mutation in that gene and 0 if not. I want to determine which genes are significantly associated with survival. I am trying to use a for loop to run the survdiff function on each gene and generate p-values, but I keep getting an error.
for (x in 2:ncol(m3)) {survdiff(Surv(m2$Overall.Survival, m2$Status) ~ x, data = m3)}
The error I keep getting is:
Error in model.frame.default(formula = Surv(m2$Overall.Survival, m2$Status) ~ :
variable lengths differ (found for 'x')
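For context on the error: inside the loop, x is the loop index (a single integer such as 2), not a column of m3, so the right-hand side of the formula has length 1 while the Surv object has one entry per sample. Below is a minimal sketch of one way the loop could be restructured. It rests on several assumptions: the data are simulated stand-ins, the ID column name SampleID and the gene names GeneA to GeneC are hypothetical, and Status is assumed to be coded 1 for deceased and 0 for alive.

```r
library(survival)

set.seed(1)
n <- 60

# Simulated stand-ins for the real m2 and m3 (all values are made up).
m2 <- data.frame(SampleID = paste0("S", seq_len(n)),
                 Overall.Survival = rexp(n, rate = 0.1),
                 Status = rbinom(n, 1, 0.7))   # assumed: 1 = deceased, 0 = alive
m3 <- data.frame(SampleID = m2$SampleID,
                 GeneA = rbinom(n, 1, 0.3),
                 GeneB = rbinom(n, 1, 0.5),
                 GeneC = rbinom(n, 1, 0.4))

# Join on the sample ID so survival data and mutation flags are row-aligned.
dat <- merge(m2, m3, by = "SampleID")
genes <- setdiff(names(m3), "SampleID")

# One log-rank test per gene; the grouping variable is the gene's column,
# not the loop index.
pvals <- sapply(genes, function(g) {
  # A gene with only one observed value cannot split samples into two groups.
  if (length(unique(dat[[g]])) < 2) return(NA_real_)
  fit <- survdiff(Surv(Overall.Survival, Status) ~ dat[[g]], data = dat)
  # p-value from the chi-square statistic, df = number of groups - 1.
  1 - pchisq(fit$chisq, df = length(fit$n) - 1)
})

print(pvals)
```

With hundreds of genes, the raw p-values would likely need a multiple-testing correction, for example p.adjust(pvals, method = "BH").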