I am trying to transform data into data2. I am looking for a base R or dplyr solution. There is an ID associated with each policy. There is a start date and an end date. These are all given. A policy year starts on the start date and ends one year later. A policy may go for several years. The first part of a policy needs to have a PolYr value of 0. When the policy year goes into the next year, PolYr takes the value 1. I was able to figure that out via
Numeric sequence with condition
For each contract, there is a row for each PolYr and CaldYr combination. I also need to determine the CaldYr. Looking at ID = 103, we see that the contract starts in 2011, its first row will have PolYr = 0 and CaldYr = 2011. The second part of PolYr 0 goes into 2012 so the second row for ID = 103 will have PolYr = 1 and CaldYr = 2012. This policy is more than 2 years in length and finishes in late 2013 so it goes for five rows.
Below are before and after data frames. I did some research, but did not find anything that I perceived as corresponding to my problem.
library(dplyr)
ID = c(101, rep(102, 2), rep(103,5))
start = as.Date(c('2/1/2010', rep('5/17/2011', 2), rep('5/17/2011', 5)), '%m/%d/%Y')
end = as.Date(c('3/5/2010', rep('1/4/2012', 2 ), rep('8/4/2013', 5 )), '%m/%d/%Y')
data = data.frame(ID = ID, start = start, end = end)
v = c(0,1)
data = data %>% group_by(ID) %>% mutate(PolYr = rep_len(v, length(ID)))
data
data2 = data
data2$CaldrYr = c(2010, 2011, 2012, 2011, 2012, 2012, 2013, 2013)
data2