
I want to create a new variable/column based on a combination of row values. I have over 70K IDs, and each ID has four rows (one for every year, 2013-2016). For each year, an ID gets a value of "0" or "1". For 2013, only "0" is possible (for everyone), and for 2014-2016 an ID can only have all "0" or all "1" (so there are two possible combinations, 0000 or 0111, spread over separate rows).

I want to create a new variable that indicates which group each ID belongs to. If an ID has the combination "0000" over the four years, I want a 0 for all years in that new column. If an ID has the combination "0111", I want a 1 for all years in that new column. That way, I can create a control and a treatment group for my analyses. My dataframe contains additional variables, e.g. gender.

structure(list(Year = c(2013, 2014, 2015, 2016, 2013, 2014, 2015, 
2016), Value = c(0, 0, 0, 0, 0, 1, 1, 1), ID = c(1, 1, 1, 1, 
2, 2, 2, 2), Gender = c(0, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, 
-8L), class = c("tbl_df", "tbl", "data.frame"))

tibble [8 x 4] (S3: tbl_df/tbl/data.frame)
 $ Year  : num [1:8] 2013 2014 2015 2016 2013 ...
 $ Value : num [1:8] 0 0 0 0 0 1 1 1
 $ ID    : num [1:8] 1 1 1 1 2 2 2 2
 $ Gender: num [1:8] 0 0 0 0 0 0 0 0

I've already tried the approaches from these questions, but I couldn't make them work on my dataframe:
- How do I create a new column based on multiple conditions from multiple columns?
- How to create new variable based on a combination of values in other variables

Hopefully somebody has some tips!

Thank you for your help!

1 Answer


We can check, within each ID, whether any Value is 1 (converting the binary values to logical with as.logical) and then coerce the result back to binary with unary + or as.integer.

library(dplyr)

df1 %>%
    group_by(ID) %>%                       # evaluate each ID's rows together
    mutate(new = +any(as.logical(Value)))  # 1 if the ID has any 1, otherwise 0
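
The same per-ID check can also be sketched in base R, which may help if dplyr is not available. This is only an illustration, not part of the answer above: it rebuilds the example data from the question as a plain data.frame, reuses the name df1 from the answer, and assumes Value only ever contains 0 or 1.

```r
# Example data copied from the question (as a plain data.frame, not a tibble).
df1 <- data.frame(
  Year   = rep(2013:2016, times = 2),
  Value  = c(0, 0, 0, 0, 0, 1, 1, 1),
  ID     = rep(1:2, each = 4),
  Gender = 0
)

# ave() applies FUN to the Value entries of each ID and returns a vector
# aligned with the original rows; the single result per ID is recycled
# across that ID's rows.
# Assumption: Value is strictly 0/1, so "any 1" identifies the 0111 group.
df1$new <- ave(df1$Value, df1$ID, FUN = function(v) as.integer(any(v == 1)))

print(df1)
```

Because ave() keeps the row order of the data frame, the new column can be assigned directly without sorting or merging.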
akrun