Background
I've got this R data frame, df:
df <- data.frame(ID = c("a","a","a","b", "c","c","c"),
event = c(0,1,0,0,0,1,0),
event_date = as.Date(c("2011-01-01","2011-08-21","2011-12-23","2014-12-31","2013-03-14","2013-04-07","2014-07-14")),
stringsAsFactors=FALSE)
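It looks like this:
  ID event event_date
1  a     0 2011-01-01
2  a     1 2011-08-21
3  a     0 2011-12-23
4  b     0 2014-12-31
5  c     0 2013-03-14
6  c     1 2013-04-07
7  c     0 2014-07-14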
As you can see, it has three people (IDs), a 0/1 indicator (event) of whether an event of interest occurred, and the event_date of each record.
The Problem / Question
I want to edit the event column so that, for each ID where any row has event = 1, all rows chronologically after the first event = 1 (ordered by event_date) are also marked event = 1.
In other words, I'd like something that looks like this:
df_want <- data.frame(ID = c("a","a","a","b", "c","c","c"),
event = c(0,1,1,0,0,1,1),
event_date = as.Date(c("2011-01-01","2011-08-21","2011-12-23","2014-12-31","2013-03-14","2013-04-07","2014-07-14")),
stringsAsFactors=FALSE)
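Which would look like so:
  ID event event_date
1  a     0 2011-01-01
2  a     1 2011-08-21
3  a     1 2011-12-23
4  b     0 2014-12-31
5  c     0 2013-03-14
6  c     1 2013-04-07
7  c     1 2014-07-14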
As you can see, for IDs a and c, the event column now shows event = 1 for that ID's rows after the date of the first event = 1. For ID b, nothing changes, as it has no rows with event = 1.
Note that in the "real" dataset I'm doing this procedure on, IDs generally have many more rows after their first event = 1, so a solution would need to apply to all of them. (I add this caveat since IDs a and c only have one row after their event = 1.)
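To make the requirement concrete, here is a minimal sketch of one possible approach, not a definitive answer: sort by event_date within each ID and take a running maximum of event, so that once it reaches 1 it stays 1 for every later row, however many there are. It assumes dplyr is available, that event contains only 0/1 with no NA values, and that dates are not tied with the first event within an ID. The name df_filled is hypothetical and used only for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  ID = c("a", "a", "a", "b", "c", "c", "c"),
  event = c(0, 1, 0, 0, 0, 1, 0),
  event_date = as.Date(c("2011-01-01", "2011-08-21", "2011-12-23", "2014-12-31",
                         "2013-03-14", "2013-04-07", "2014-07-14")),
  stringsAsFactors = FALSE
)

# df_filled is a hypothetical name, not part of the original data
df_filled <- df %>%
  arrange(ID, event_date) %>%        # chronological order within each ID
  group_by(ID) %>%
  mutate(event = cummax(event)) %>%  # running maximum: stays 1 after the first 1
  ungroup()
```

Because cummax() runs over the whole group, this should carry the 1 forward to any number of later rows, and IDs with no event = 1 (such as b) are left unchanged.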
What I've tried
I have some code that will apply event = 1 to everything, but that obviously doesn't get me very far:
library(dplyr)
# Sets event to 1 on every row, regardless of ID or event_date
df <- df %>%
  mutate(event = 1)
Any help is greatly appreciated.