I have a panel data set for which I would like to create a counter that increases with each step in the panel but restarts whenever some condition occurs. In my case, I'm using country-year data and want to count the passage of years between an event. Here's a toy data set with the key features of my real one:
df <- data.frame(country = rep(c("A","B"), each=5), year=rep(2000:2004, times=2), event=c(0,0,1,0,0,1,0,0,1,0), stringsAsFactors=FALSE)
What I'm looking to do is to create a counter that is keyed to df$event
within each country's series of observations. The clock starts at 1 when we start observing each country; it increases by 1 with the passage of each year; and it restarts at 1 whenever df$event==1
. The desired output is this:
country year event clock
1 A 2000 0 1
2 A 2001 0 2
3 A 2002 1 1
4 A 2003 0 2
5 A 2004 0 3
6 B 2000 1 1
7 B 2001 0 2
8 B 2002 0 3
9 B 2003 1 1
10 B 2004 0 2
I have tried using getanID
from splitstackshape
and a few variations of if
and ifelse
but have failed so far to get the desired result.
I'm already using dplyr
in the scripts where I need to do this, so I would prefer a solution that uses it or base R, but I would be grateful for anything that works. My data sets are not massive, so speed is not critical, but efficiency is always a plus.