I am trying to replicate, on my own data frame, the difference-in-differences (DiD) analysis of Callaway and Sant'Anna (2021). Because treatment timing varies across units, I need to define a variable first.treat that reports, for each ID, the year in which that unit was first treated (Treatment = 0 if the unit is not treated in that year, 1 if it is). For units that are never treated, first.treat should be 0. Below is a simplified data frame: I have the variables ID, Year, and Treatment, and I need to create first.treat as follows.
ID | Year | Treatment | first.treat |
---|---|---|---|
a | 2016 | 0 | 2017 |
a | 2017 | 1 | 2017 |
a | 2018 | 1 | 2017 |
b | 2016 | 1 | 2016 |
b | 2017 | 1 | 2016 |
b | 2018 | 1 | 2016 |
c | 2016 | 0 | 2018 |
c | 2017 | 0 | 2018 |
c | 2018 | 1 | 2018 |
d | 2016 | 0 | 0 |
d | 2017 | 0 | 0 |
d | 2018 | 0 | 0 |
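
For reproducibility, here is the example above as a small base R sketch. The object names df and expected_first_treat are only placeholders chosen for this post; the values are transcribed from the table.

```r
# Example input, transcribed from the table above
df <- data.frame(
  ID        = rep(c("a", "b", "c", "d"), each = 3),
  Year      = rep(2016:2018, times = 4),
  Treatment = c(0, 1, 1,   # a: first treated in 2017
                1, 1, 1,   # b: first treated in 2016
                0, 0, 1,   # c: first treated in 2018
                0, 0, 0)   # d: never treated
)

# Desired first.treat column, in the same row order as df
expected_first_treat <- c(rep(2017, 3), rep(2016, 3), rep(2018, 3), rep(0, 3))
```
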
How can I do this in R? Thank you.