I have the following data frame,
Leaflet start_date end_date id
1: 1 2022-01-05 2022-01-08 15
2: 1 2022-01-05 2022-01-11 14
3: 1 2022-01-09 2022-01-11 10
4: 10 2022-03-09 2022-03-12 21
5: 10 2022-03-09 2022-03-15 18
6: 10 2022-03-13 2022-03-15 32
7: 11 2022-03-16 2022-03-19 30
8: 11 2022-03-16 2022-03-22 4
9: 11 2022-03-20 2022-03-22 39
As you can see there are overlapping intervals. What I would like to do is to create a new_start_date_promo where it will roll the end date to the start date If the intervals overlap. In the end I want to get the following:
Leaflet start_date_promo end_date_promo id new_start_promo
1: 1 2022-01-05 2022-01-08 15 2022-01-05
2: 1 2022-01-05 2022-01-11 14 2022-01-08
3: 1 2022-01-09 2022-01-11 10 2022-01-09
4: 10 2022-03-09 2022-03-12 21 2022-03-09
5: 10 2022-03-09 2022-03-15 18 2022-03-12
6: 10 2022-03-13 2022-03-15 32 2022-03-13
7: 11 2022-03-16 2022-03-19 30 2022-03-16
8: 11 2022-03-16 2022-03-22 4 2022-03-19
9: 11 2022-03-20 2022-03-22 39 2022-03-20
I tried this with dplyr
but it looks very convoluted, messes up the date formats and I am not sure if it will follow my methodology. Nevertheless, this was my attempt:
library(dplyr)
leaflet_dim %>%
group_by(Leaflet) %>%
arrange(start_date, end_date) %>%
mutate(new_start = lag(ifelse(start_date == lead(start_date), end_date_promo, lead(start_date)))) %>%
mutate(new_start = anytime::anytime(ifelse(is.na(new_start),start_date_promo, new_start)), new_start = new_start - (24*60*60))
DATA
structure(list(Leaflet = c("1", "1", "1",
"10", "10", "10", "11", "11",
"11"), start_date_promo = structure(c(18997, 18997, 19001,
19060, 19060, 19064, 19067, 19067, 19071), class = "Date"), end_date_promo = structure(c(19000,
19003, 19003, 19063, 19066, 19066, 19070, 19073, 19073), class = "Date"),
id = c(15L, 14L, 10L, 21L, 18L, 32L, 30L, 4L, 39L)), row.names = c(NA,
-9L), class = c("data.table", "data.frame"))