There is the built-in lubridate::int_overlaps
but that only returns a logical, not how long they overlap. Luckily, the intersection
function has a method for Interval
objects. The only trick is that if there's no overlap, it returns length-NA
, not length-0
. So we can wrap up that logic like this:
library(lubridate)
int_overlaps_numeric <- function (int1, int2) {
stopifnot(c(is.interval(int1), is.interval(int2)))
x <- intersect(int1, int2)@.Data
x[is.na(x)] <- 0
as.duration(x)
}
This constructs the interval that is the overlap, then extracts the length of it (in seconds). If it's NA
, change it to zero, and return. as.duration
just gives us pretty-printing. Now you just have to give it two intervals:
int1 <- as.interval(5, Sys.time())
int2 <- as.interval(5, Sys.time()+3)
int_overlaps_numeric(int1, int2)
"1.99299597740173s"
So you need to get all your holidays into intervals, and all your shifts into intervals. Presumably you want to associate these overlaps with other data in the shift_time
dataframe, so we'll use dplyr
to do all our work inside there. However, you want to check each shift against a vector of all holidays, so we should add another helper function (using purrr::map
).
library(dplyr)
library(purrr)
check_shift_against_holidays <- function(shift, holidays) {
map(shift, ~sum(int_overlaps_numeric(.x, holidays))) %>%
unlist() %>%
as.duration()
}
This function takes two vectors of intervals. For each element of the first vector, it counts overlaps with every element of the second vector, then adds them up. Then turn it from a list back into a vector, and reclass it as a duration
for pretty-printing. The caveat here is that if there are any overlaps in the holidays
vector, those hours will be double-counted.
# days(1) since the holiday lasts all day
holiday_intervals <- as.interval(days(1), ymd(public_holidays$date))
shift_time %>%
mutate(
shift = interval(ymd_hms(started_at), ymd_hms(ended_at)),
holiday_hours = check_shift_against_holidays(shift, holiday_intervals)
)
started_at ended_at shift holiday_hours
1 2019-09-01 02:00:00 AEST 2019-09-01 11:30:00 AEST 2019-09-01 02:00:00 UTC--2019-09-01 11:30:00 UTC 0s
2 2019-09-02 05:00:00 AEST 2019-09-02 19:00:00 AEST 2019-09-02 05:00:00 UTC--2019-09-02 19:00:00 UTC 0s
3 2019-11-04 20:00:00 AEDT 2019-11-05 04:00:00 AEDT 2019-11-04 20:00:00 UTC--2019-11-05 04:00:00 UTC 14400s (~4 hours)
And if you're really opposed to creating any new intermediate variables:
shift_time %>%
mutate(
holiday_hours = check_shift_against_holidays(
ymd_hms(started_at) %--% ymd_hms(ended_at),
holiday_intervals
)
)
started_at ended_at holiday_hours
1 2019-09-01 02:00:00 AEST 2019-09-01 11:30:00 AEST 0s
2 2019-09-02 05:00:00 AEST 2019-09-02 19:00:00 AEST 0s
3 2019-11-04 20:00:00 AEDT 2019-11-05 04:00:00 AEDT 14400s (~4 hours)