How do I generate dummy variables, one column per year, that are zero before each row's Year and take the value 1 from that Year onwards, up to 2019? Original data:

structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8), Year = c(2017, 
2015, 2018, 2018, 2018, 2018, 2018, 2018)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -8L))

What I need:

structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8), Year = c(2017, 
2015, 2018, 2018, 2018, 2018, 2018, 2018), `2015` = c(NA, 1, 
NA, NA, NA, NA, NA, NA), `2016` = c(NA, 1, NA, NA, NA, NA, NA, 
NA), `2017` = c(1, 1, NA, NA, NA, NA, NA, NA), `2018` = c(1, 
1, 1, 1, 1, 1, 1, 1), `2019` = c(1, 1, 1, 1, 1, 1, 1, 1)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -8L))

janeluyip

1 Answer

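Split Year by id, extend each start year to the range i:2019, stack the result into long format, then reshape from long to wide. Here df2 is the data frame from the question. Note that the filled cells hold the year itself rather than 1: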

res <- reshape(stack(sapply(split(df2$Year, df2$id), function(i) i:2019)),
               timevar = "values", v.names = "values", idvar = "ind", 
               direction = "wide")

# fix the column names order
res <- res[ sort(colnames(res)) ]

res
#    ind values.2015 values.2016 values.2017 values.2018 values.2019
# 1    1          NA          NA        2017        2018        2019
# 4    2        2015        2016        2017        2018        2019
# 9    3          NA          NA          NA        2018        2019
# 11   4          NA          NA          NA        2018        2019
# 13   5          NA          NA          NA        2018        2019
# 15   6          NA          NA          NA        2018        2019
# 17   7          NA          NA          NA        2018        2019
# 19   8          NA          NA          NA        2018        2019
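
If the cells should hold 1 rather than the year, as in the requested output, one possible alternative is to compare every row's Year against each target year directly. The following is only a sketch in base R: the names `df`, `years`, `dummies` and `out` are made up for illustration, and the 2015:2019 range is assumed from the question's desired output.

```r
# Data from the question, rebuilt as a plain data.frame (name `df` is assumed)
df <- data.frame(
  id   = 1:8,
  Year = c(2017, 2015, 2018, 2018, 2018, 2018, 2018, 2018)
)

# Target columns; the range 2015:2019 is an assumption taken from the question
years <- 2015:2019

# outer() pairs every row's start year with every target year;
# a cell is 1 when the target year is at or after the start year, else NA
dummies <- outer(df$Year, years,
                 function(start, yr) ifelse(yr >= start, 1, NA))
colnames(dummies) <- as.character(years)

# check.names = FALSE keeps "2015" etc. as column names instead of "X2015"
out <- data.frame(df, dummies, check.names = FALSE)
out
```

To get 0 instead of NA before the start year, replace `NA` with `0` in the `ifelse()` call.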
zx8754