Yesterday I asked a question, "Complex sequence based on a condition".
Thank you to those who helped me solve it. My minimal example was:
library(dplyr)
ID = c(101, rep(102, 2), rep(103,5))
start = as.Date(c('2/1/2010', rep('5/17/2011', 2), rep('5/17/2011', 5)), '%m/%d/%Y')
end = as.Date(c('3/5/2010', rep('1/4/2012', 2 ), rep('8/4/2013', 5 )), '%m/%d/%Y')
data = data.frame(ID = ID, start = start, end = end)
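# Pattern to repeat within each ID
v = c(0,1)
# For each ID, fill PolYr with 0, 1, 0, 1, ... for as many rows as that ID has
data = data %>% group_by(ID) %>% mutate(PolYr = rep_len(v, length(ID)))
data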
Now I am hoping someone can help me with this part of the code:
v = c(0,1)
data = data %>% group_by(ID) %>% mutate(PolYr = rep_len(v, length(ID)))
The code runs. However, on my real data, which has more than 2 million rows and hundreds of thousands of IDs, the elapsed time was 2297.74. I am hoping someone can suggest a faster method, perhaps with data.table, which I am just starting to learn. The goal is for each ID's PolYr to start at 0, continue with 1 (if there is a second row), and then keep alternating 0, 1, …
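To make the goal concrete, here is a minimal sketch of one way the alternating pattern could be written with data.table. It assumes the `rowid()` helper from data.table (a per-group row counter) and a table name `dt` chosen only for illustration. It has not been timed on the real data, so any speed-up is untested, and it produces an integer PolYr where the dplyr version above gives a numeric one.

```r
library(data.table)

# Same minimal example as above, built as a data.table (dt is an illustrative name)
ID    <- c(101, rep(102, 2), rep(103, 5))
start <- as.Date(c('2/1/2010', rep('5/17/2011', 2), rep('5/17/2011', 5)), '%m/%d/%Y')
end   <- as.Date(c('3/5/2010', rep('1/4/2012', 2), rep('8/4/2013', 5)), '%m/%d/%Y')
dt    <- data.table(ID = ID, start = start, end = end)

# rowid(ID) numbers the rows 1, 2, 3, ... within each ID in order of appearance.
# Subtracting 1 and taking the remainder mod 2 turns that into 0, 1, 0, 1, ...
# The column is added by reference in one vectorised step, with no per-group call.
dt[, PolYr := (rowid(ID) - 1L) %% 2L]
dt
```

On the minimal example this gives 0 for ID 101, then 0, 1 for ID 102, and 0, 1, 0, 1, 0 for ID 103.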