Imagine I have a dataset with observations for a number of individuals across multiple years. Each year, an individual is in one of two statuses, A or B. I have data on which status each individual was in each year, and I created a dummy variable Status_change, which equals 1 if the status in the current year differs from the status in the previous year. So my data currently looks something like:
Individual | Year | Status | Status_change |
--------------------------------------------
1          | 1    | A      | NA            |
1          | 2    | A      | 0             |
1          | 3    | A      | 0             |
1          | 4    | B      | 1             |
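To make this reproducible, the example rows above can be built in base R roughly as follows. This is only a sketch of the toy table; my real data has more individuals and years, and the way Status_change is derived here is one possible construction, not necessarily how it was originally computed.

```r
# Toy data matching the table above (a single individual, four years)
data <- data.frame(
  Individual = c(1, 1, 1, 1),
  Year       = c(1, 2, 3, 4),
  Status     = c("A", "A", "A", "B")
)

# Status_change: 1 if the status differs from the previous year,
# NA in the first year because there is no previous year to compare with
data$Status_change <- c(
  NA,
  as.integer(data$Status[-1] != data$Status[-nrow(data)])
)

print(data)
```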
What I want is to create a new variable that measures how long the individual has remained in the same status; let's call it Duration. In the context of the above example, it would look something like:
Individual | Year | Status | Status_change | Duration |
-------------------------------------------------------
1          | 1    | A      | NA            | 0        |
1          | 2    | A      | 0             | 1        |
1          | 3    | A      | 0             | 2        |
1          | 4    | B      | 1             | 0        |
Essentially, I am looking for a variable which is initially 0 for all individuals in year 1 and grows by 1 unit each period as long as the status remains the same. If the status switches, the variable takes the value 0 again and the whole thing starts over. So far I have attempted:
data %>%
  group_by(Individual) %>%
  arrange(Year, .by_group = TRUE) %>%
  mutate(Duration = ifelse(Year == 1, 0, ifelse(Status_Change == 1, 0, lag(Duration) + 1)))
But this gives me an error:
Error: Problem with `mutate()` column `Duration`.
i `Duration = ifelse(Year == 1, 0, ifelse(Status_Change == 1, 0, lag(Duration) + 1))`.
x could not find function "Duration"
i The error occurred in group 1: Individual = "1"
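I suspect part of the problem is that Duration does not exist yet at the point where lag(Duration) is evaluated, so the column cannot be defined in terms of its own previous value this way. To make the intended rule concrete, here is a minimal base R sketch of it that avoids the self-reference by numbering runs of identical status. The helper name run_duration is made up for illustration, and the sketch assumes Status has no missing values and that rows are sorted by Year within each individual. What I am really after is the idiomatic dplyr equivalent.

```r
# Hypothetical helper: for a vector of statuses in time order, return how many
# consecutive periods the current status has already lasted (0 at the start of a run)
run_duration <- function(status) {
  # TRUE where a new run starts: the first element, or any change from the previous one
  new_run <- c(TRUE, status[-1] != status[-length(status)])
  # Running count of run starts gives each run its own id
  run_id <- cumsum(new_run)
  # Position within each run, shifted so that every run starts at 0
  ave(seq_along(status), run_id, FUN = seq_along) - 1
}

# Toy data from the example above
data <- data.frame(
  Individual = c(1, 1, 1, 1),
  Year       = c(1, 2, 3, 4),
  Status     = c("A", "A", "A", "B")
)

# Sort by individual and year, then apply the helper within each individual
data <- data[order(data$Individual, data$Year), ]
data$Duration <- ave(
  seq_len(nrow(data)), data$Individual,
  FUN = function(i) run_duration(data$Status[i])
)

print(data)
```

On the four example rows this should reproduce the Duration column from the table above.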
I would greatly appreciate any help you can give me! Thanks in advance!