I want to create a new variable/column based on a combination of row values. I have over 70K IDs, and each ID has four rows (one for every year, 2013-2016). For each year, an ID gets a value of "0" or "1". For 2013 only "0" is possible (for everyone), and for 2014-2016 an ID can only have all "0" OR all "1". So there are two possible combinations, 0000 or 0111, spread over separate rows.
I want to create a new variable that indicates which group each ID belongs to. If an ID has the combination "0000" over the four years, I want a 0 for all years in that new column. If an ID has the combination "0111", I want a 1 for all years in that new column. That way, I can create a control and a treatment group for my analyses. My dataframe contains additional variables, e.g. gender.
Sample data (output of dput()):
structure(list(Year = c(2013, 2014, 2015, 2016, 2013, 2014, 2015,
2016), Value = c(0, 0, 0, 0, 0, 1, 1, 1), ID = c(1, 1, 1, 1,
2, 2, 2, 2), Gender = c(0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
Structure of the sample data (output of str()):
tibble [8 x 4] (S3: tbl_df/tbl/data.frame)
$ Year : num [1:8] 2013 2014 2015 2016 2013 ...
$ Value : num [1:8] 0 0 0 0 0 1 1 1
$ ID : num [1:8] 1 1 1 1 2 2 2 2
$ Gender: num [1:8] 0 0 0 0 0 0 0 0
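To make the goal concrete, here is a minimal base-R sketch of the result I'm after. It assumes the sample data is stored in a data frame called `df` and that the new column is called `Group`; both names are hypothetical. It also relies on the rule described above: since only 0000 and 0111 occur, the per-ID maximum of `Value` is 0 for the first pattern and 1 for the second.

```r
# Sample data rebuilt as a plain data frame
# (same values as the dput() output above; `df` is a hypothetical name)
df <- data.frame(
  Year   = rep(c(2013, 2014, 2015, 2016), times = 2),
  Value  = c(0, 0, 0, 0, 0, 1, 1, 1),
  ID     = rep(c(1, 2), each = 4),
  Gender = 0
)

# Hypothetical new column `Group`: the per-ID maximum of Value,
# repeated on every row of that ID (0 for 0000, 1 for 0111)
df$Group <- ave(df$Value, df$ID, FUN = max)

print(df)
```

With this data, ID 1 should get `Group` 0 on all four rows and ID 2 should get `Group` 1 on all four rows. This only holds as long as no other patterns appear in the full data.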
I've already tried the code from the following questions, but I couldn't make it work on my dataframe:
- How do I create a new column based on multiple conditions from multiple columns?
- How to create new variable based on a combination of values in other variables
Hopefully somebody has some tips!
Thank you for your help!