I am an R beginner and have got stuck on this seemingly simple problem. I have a large data frame with 4 columns: id, date of observation, a value (alb) and an end date. A single id may have between 1 and about 15 observations at different dates. The end date is the time of event or censoring, with one end date per id.
id date alb end
1143 2010-03-23 41 2010-12-15
1143 2010-06-29 39 2010-12-15
1144 2008-01-01 34 2009-08-06
1145 2010-03-23 42 2012-10-25
1145 2011-01-12 45 2012-10-25
For survival analysis using alb as a time-varying covariate, I am trying to create an episode for each observation, with a start and a stop time column. The start time is the date of the observation. The stop time should be the date of the next alb observation for that id, or the end date if there is no further alb observation for that id. Like so:
id date alb end start stop
1143 2010-03-23 41 2010-12-15 2010-03-23 2010-06-29
1143 2010-06-29 39 2010-12-15 2010-06-29 2010-12-15
1144 2008-01-01 34 2009-08-06 2008-01-01 2009-08-06
1145 2010-03-23 42 2012-10-25 2010-03-23 2011-01-12
1145 2011-01-12 45 2012-10-25 2011-01-12 2012-10-25
I am getting stuck on creating the column of stop times. I got into a mess trying to write a function with nested if/else statements. Does anyone have a simple approach? Thanks in advance!
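To make the rule concrete, here is a minimal base R sketch of it, not a definitive solution. It assumes the data frame is called df (a name chosen here for illustration), that date and end are stored as Date rather than character or factor, and that sorting by id and date is acceptable. The helper names next_date and same_id_next are also just illustrative.

```r
# Example data copied from the first table in the question.
df <- data.frame(
  id   = c(1143, 1143, 1144, 1145, 1145),
  date = as.Date(c("2010-03-23", "2010-06-29", "2008-01-01",
                   "2010-03-23", "2011-01-12")),
  alb  = c(41, 39, 34, 42, 45),
  end  = as.Date(c("2010-12-15", "2010-12-15", "2009-08-06",
                   "2012-10-25", "2012-10-25"))
)

# Sort so that each id's observations are in date order.
df <- df[order(df$id, df$date), ]
n <- nrow(df)

# The start of each episode is the observation date itself.
df$start <- df$date

# Candidate stop time: the date on the following row (NA for the last row).
next_date <- c(df$date[-1], as.Date(NA))

# TRUE where the following row belongs to the same id.
same_id_next <- c(df$id[-1] == df$id[-n], FALSE)

# Start from the end date and overwrite it where a later observation exists.
# Assigning into a Date vector keeps the Date class, which ifelse() would drop.
df$stop <- df$end
df$stop[same_id_next] <- next_date[same_id_next]

print(df)
```

The sketch avoids nested if/else by comparing each row with the row after it, so the whole column is filled with two vectorised assignments.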
In reply to r2evans: below is a portion of the data frame for which some of the values returned by the dplyr code come out as 1970-01-01 (the full data frame is about 130,000 rows). Thanks.
id date alb end
1143 2010-03-23 41.0 1996-08-10
1143 2010-06-29 39.0 1996-08-10
1143 2011-01-12 42.0 1996-08-10
1143 2010-09-28 47.0 1996-08-10
1143 2011-07-19 40.0 1996-08-10
1143 2012-06-12 41.0 1996-08-10
1143 2013-06-25 40.0 1996-08-10
1143 2013-12-26 40.0 1996-08-10
1143 2014-06-15 40.0 1996-08-10
1143 2014-12-26 39.9 1996-08-10
1144 2008-01-01 34.0 2015-04-28
1145 2010-03-23 42.0 2015-04-28
1145 2012-01-13 44.0 2015-04-28
1145 2012-06-15 41.0 2015-04-28
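A few sanity checks that may help narrow down where the 1970-01-01 values come from, offered as a sketch rather than a diagnosis. In R, 1970-01-01 is how a Date with underlying value 0 is printed, so one possibility (an assumption, not something confirmed by the data) is that a numeric value is being turned into a date somewhere. The sample above also shows rows for id 1143 that are not in date order, and end dates that precede the observation dates. The data frame name df and the helper names below are illustrative only.

```r
# A few rows copied from the sample above (not the full data frame).
df <- data.frame(
  id   = c(1143, 1143, 1143, 1143, 1144),
  date = as.Date(c("2010-03-23", "2010-06-29", "2011-01-12",
                   "2010-09-28", "2008-01-01")),
  alb  = c(41, 39, 42, 47, 34),
  end  = as.Date(c("1996-08-10", "1996-08-10", "1996-08-10",
                   "1996-08-10", "2015-04-28"))
)

# 1. Both date columns should be of class Date
#    (not character, factor or numeric).
col_classes <- sapply(df[c("date", "end")], function(x) class(x)[1])
print(col_classes)

# 2. Rows whose date is earlier than the previous row of the same id,
#    i.e. the data are not sorted by date within id.
n <- nrow(df)
out_of_order <- c(FALSE,
                  df$id[-1] == df$id[-n] & df$date[-1] < df$date[-n])
print(df[out_of_order, ])

# 3. Rows where the end date falls before the observation date.
end_before_obs <- df$end < df$date
print(df[end_before_obs, ])

# 4. Day 0 of the Date class, for comparison with the unexpected values.
origin <- as.Date(0, origin = "1970-01-01")
print(origin)
```

If check 1 reports anything other than Date on the real data, converting the columns with as.Date() before computing stop times would be a reasonable first step.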