Here is a solution that should generalize to multiple order IDs. I created sample data with two order IDs. The basic idea is to calculate the number of 30-day intervals between start_date and end_date. We then repeat the row for each order ID once per interval and create a sequence number that records which interval each copy represents. This is the purpose of the functions f and g and of the calls to Map.
The rest is vector arithmetic that defines split_start_date and split_end_date. The last statement sets split_end_date of each order's final interval to end_date, so that it never runs past end_date.
df <- data.frame(
  order_id = c(1, 2),
  start_date = c(as.Date("2017-05-01"), as.Date("2017-08-01")),
  end_date = c(as.Date("2017-07-06"), as.Date("2017-09-15"))
)
# Length of each order in days, and the number of 30-day intervals needed to cover it
df$diff_days <- as.integer(df$end_date - df$start_date)
df$num_int <- ceiling(df$diff_days / 30)
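# Row index repeated once per interval of that order
f <- function(rowindex) {
  rep(rowindex, times = df[rowindex, "num_int"])
}
# Interval counter 1, 2, ..., num_int for that order
# (seq_len() returns an empty vector when num_int is 0, whereas 1:0 gives c(1, 0))
g <- function(rowindex) {
  seq_len(df[rowindex, "num_int"])
}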
rowindex_rep <- unlist(Map(f, seq_len(nrow(df))))
df2 <- df[rowindex_rep, ]
df2$seq <- unlist(Map(g, seq_len(nrow(df))))
df3 <- df2
df3$split_start_date <- df3$start_date + (df3$seq - 1) * 30
df3$split_end_date <- df3$split_start_date + 29
# The final interval of each order ends on the order's end_date
is_last <- df3$seq == df3$num_int
df3$split_end_date[is_last] <- df3$end_date[is_last]
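The same steps can probably be written without the helper functions and Map, since base R's rep() accepts a vector of repeat counts and sequence() generates 1..n for each count. The following is only a sketch of that idea, not part of the answer above: the function name split_intervals and its width argument are made up for illustration, and it assumes that an order starting and ending on the same day should still produce one row.

```r
# Hypothetical helper: split each order into consecutive windows of `width`
# days, with the last window of each order ending on that order's end_date.
split_intervals <- function(df, width = 30) {
  diff_days <- as.integer(df$end_date - df$start_date)
  # Assumption: a same-day order (diff_days == 0) still gets one interval.
  num_int <- pmax(1L, as.integer(ceiling(diff_days / width)))

  # Repeat each row num_int times; sequence() yields 1..n for each n in turn.
  out <- df[rep(seq_len(nrow(df)), times = num_int), , drop = FALSE]
  out$seq <- sequence(num_int)

  out$split_start_date <- out$start_date + (out$seq - 1) * width
  out$split_end_date <- out$split_start_date + (width - 1)

  # The final window of each order ends on the order's end_date.
  last <- out$seq == rep(num_int, times = num_int)
  out$split_end_date[last] <- out$end_date[last]

  rownames(out) <- NULL
  out
}

# Same sample data as above
orders <- data.frame(
  order_id = c(1, 2),
  start_date = as.Date(c("2017-05-01", "2017-08-01")),
  end_date = as.Date(c("2017-07-06", "2017-09-15"))
)
result <- split_intervals(orders)
```

For the sample data this gives five rows: three intervals for order 1 (the last one ending on 2017-07-06) and two for order 2 (the last one ending on 2017-09-15).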