
I am analysing a set of lifespan data based on a population of animals (C. elegans) and I'm not sure if I have the data set up incorrectly or I'm using the Surv function incorrectly.

I have a table with the number of days since the start and the number of animals alive on each day. I am not tracking individual animals, but the total number. I've tried having the number that died instead but that didn't change the error message I'm getting.

The data I'm using:

data = matrix (c(0,143,2,28,3,126,4,103,6,102,7,100,8,88,9,70,10,51,11,44,13,27,15,10,17,4,18,3,20,2,22,2,24,0), ncol=2, byrow = TRUE)
colnames(data) <- c("Day", "Survival")

The code I have at the moment:

data <- data %>% 
Surv (time = as.numeric("Day"), event = as.numeric("Survival"))

Note: I am using as.numeric because I am importing a CSV file and the column is marked as <dbl>

The full error message I get:

Error in Surv(., time = as.numeric("Day"), event = as.numeric("Survival")) : 
  Start and stop are different lengths
In addition: Warning message:
In Surv(., time = as.numeric("Day"), event = as.numeric("Survival")) :
  NAs introduced by coercion

Any advice is appreciated. Thank you.

Linda

1 Answer


So you started out with exactly 903 worms? And 143 of them were dead before you even started timing? And none of them survived past 24 days?

(Assuming the inferences I made from your data are correct ....)

So the Survival column should not be the event values, since the events are either 1 for death or 0 for lost to follow-up (censored). Since you apparently don't have any censoring, all the events should be 1. The numbers of deaths should be assigned to the weights argument in survfit or another fitting function.

 data = cbind( as.data.frame(matrix (c(0,143,2,28,3,126,4,103,6,102,7,100,8,88,9,70,10,51,11,44,13,27,15,10,17,4,18,3,20,2,22,2,24,0),
                                     ncol=2, byrow = TRUE)),
               "ones"=1)
 colnames(data) <- c("Day", "Survival", "ones")

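library(survival)  # provides Surv() and survfit()

# Every row is a death event (ones = 1); the counts enter as case weights
fit <- survfit(Surv(time=Day, event=ones)~1, data=data, weights=data$Survival )
png();plot(fit); dev.off()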
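As for the error in the question, a likely reading is this: as.numeric("Day") converts the literal string "Day" rather than the column, which gives NA and the coercion warning. The pipe also passes the whole table as an extra first argument, so Surv most likely treats the call as a start/stop specification with mismatched lengths. A minimal sketch of referring to the columns directly, using a small made-up data frame d (hypothetical, for illustration only):

```r
library(survival)

# Hypothetical toy data, only to illustrate column references
d <- data.frame(Day = c(2, 3, 4), ones = 1)

# The string "Day" is not the column Day: this is NA (with a warning)
not_a_column <- suppressWarnings(as.numeric("Day"))

# Refer to the columns themselves, either explicitly ...
s1 <- Surv(time = d$Day, event = d$ones)

# ... or by evaluating the call inside the data frame
s2 <- with(d, Surv(time = Day, event = ones))
```

Columns read from a CSV as <dbl> are already numeric, so no as.numeric conversion should be needed.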


IRTFM
  • Actually, the Survival column represents all the worms that are alive and countable. I started out with 143 worms, none of them survived past 24 days, and the censored worms are already removed. I'm seeing from your response I shouldn't do it that way. And...I just noticed the typo in the data I provided, the number alive on day 2 was 128. This definitely helps, I'll play around some more. Thanks. – Linda Mar 28 '19 at 22:57
  • Then you should do a serial difference of the alive items to get to the count of dead items. There's a diff function that would make it easy. You would still need to use those values as weights. – IRTFM Mar 28 '19 at 23:29
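
The serial-difference approach from the last comment could be sketched as below. This is a rough sketch under stated assumptions, not a definitive analysis: the day-2 count is taken as 128 (the correction Linda gave), every drop in the live count is treated as a death with no censoring, and each death is dated to the day it was first observed. The names day, alive and deaths are made up for the example.

```r
library(survival)

# Live counts per observation day; day 2 uses the corrected value 128
day   <- c(0, 2, 3, 4, 6, 7, 8, 9, 10, 11, 13, 15, 17, 18, 20, 22, 24)
alive <- c(143, 128, 126, 103, 102, 100, 88, 70, 51, 44, 27, 10, 4, 3, 2, 2, 0)

# Deaths seen at each observation = drop in the live count since the
# previous observation (assumption: no censored animals in these drops)
deaths <- data.frame(Day = day[-1], Deaths = -diff(alive), ones = 1)

# Drop days on which nobody died, so no zero weights are passed on
deaths <- deaths[deaths$Deaths > 0, ]

# All events are deaths (ones = 1); the death counts are the case weights
fit <- survfit(Surv(time = Day, event = ones) ~ 1, data = deaths,
               weights = Deaths)
summary(fit)
```

The weights now sum to the 143 starting animals, so the fitted curve starts from the full cohort and falls to zero at day 24.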