data.table solution
I would tackle this problem using the foverlaps() function from data.table. The only catch is that it only accepts complete date ranges, and in the sample data provided, row 5 of ranges (ranges[5, ]) has no end date.
> ranges
id start end
1 AED 2018-05-17 10:00:00 2018-05-17 11:56:00
2 CFR 2018-05-17 10:18:00 2018-05-17 12:23:00
3 DRR 2018-05-17 11:18:00 2018-05-17 12:01:00
4 DRR 2018-05-17 13:10:00 2018-05-17 14:18:00
5 UN 2018-05-17 14:18:00 <NA>
For the following solution to work, every range must have a start AND an end.
So, let's fill in the NA using a made-up timestamp.
ranges <- data.frame(id = c("AED","CFR","DRR","DRR","UN"),
start = as.POSIXct(c("2018-05-17 10:00:00","2018-05-17 10:18:00","2018-05-17 11:18:00","2018-05-17 13:10:00","2018-05-17 14:18:00")),
end = as.POSIXct(c("2018-05-17 11:56:00","2018-05-17 12:23:00","2018-05-17 12:01:00","2018-05-17 14:18:00", "2018-05-17 16:18:00")))
> ranges
id start end
1 AED 2018-05-17 10:00:00 2018-05-17 11:56:00
2 CFR 2018-05-17 10:18:00 2018-05-17 12:23:00
3 DRR 2018-05-17 11:18:00 2018-05-17 12:01:00
4 DRR 2018-05-17 13:10:00 2018-05-17 14:18:00
5 UN 2018-05-17 14:18:00 2018-05-17 16:18:00
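Instead of retyping the data, the missing end can also be filled in place. This is only a minimal sketch: it assumes an open-ended range should be treated as lasting two hours (which happens to reproduce the 16:18 placeholder used above), and the name open_range_secs and the tz = "UTC" setting are my own choices, not part of the question.

```r
library(data.table)

# Sample ranges as in the question, with a missing end in the last row
ranges <- data.table(
  id    = c("AED", "CFR", "DRR", "DRR", "UN"),
  start = as.POSIXct(c("2018-05-17 10:00:00", "2018-05-17 10:18:00",
                       "2018-05-17 11:18:00", "2018-05-17 13:10:00",
                       "2018-05-17 14:18:00"), tz = "UTC"),
  end   = as.POSIXct(c("2018-05-17 11:56:00", "2018-05-17 12:23:00",
                       "2018-05-17 12:01:00", "2018-05-17 14:18:00",
                       NA), tz = "UTC")
)

# Assumption: an open-ended range lasts two hours (hypothetical placeholder)
open_range_secs <- 2 * 60 * 60

# Fill only the rows where end is missing, by reference
ranges[is.na(end), end := start + open_range_secs]

print(ranges)
```

Any other placeholder (for example a far-future timestamp) would work the same way, as long as every row ends up with start <= end.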
Workflow
library(data.table)
#make instances a data.table without key
instances.dt <- setDT( instances, key = NULL )
#create a data.table with the ranges, set keys
ranges.dt <- setDT( ranges, key = c("id", "start", "end") )
#create a temporary 'range', where start == end, based on the dates-column
instances.dt[, c( "start", "end") := dates]
#create a column 'inRange' using data.table's foverlaps().
#use the second column of the foverlaps() result (the 'start' column coming from ranges.dt).
#If this column is NA, then no 'hit' was found in ranges.dt and inRange == FALSE, else inRange == TRUE.
#[[2]] extracts the column as a plain vector, so is.na() returns a logical vector
instances.dt[, inRange := !is.na( foverlaps(instances.dt, ranges.dt, type = "within", mult = "first", nomatch = NA)[[2]] )]
#outsideRange is the opposite of inRange
instances.dt[, outsideRange := !inRange]
#remove the temporary columns 'start' and 'end'
instances.dt[, c("start", "end") := NULL]
Result
> instances.dt
id dates inRange outsideRange
1: AED 2018-05-17 09:52:00 FALSE TRUE
2: AED 2018-05-17 10:49:00 TRUE FALSE
3: CFR 2018-05-17 10:38:00 TRUE FALSE
4: DRR 2018-05-17 11:29:00 TRUE FALSE
5: DRR 2018-05-17 12:12:00 FALSE TRUE
6: DRR 2018-05-17 13:20:00 TRUE FALSE
7: UN 2018-05-17 14:28:00 TRUE FALSE
8: PO 2018-05-17 15:59:00 FALSE TRUE
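The instances table is never defined in this answer, so here is a self-contained sketch that can be pasted into a fresh session. The instances data is reconstructed from the printed result above, tz = "UTC" is my assumption to keep the timestamps unambiguous, and the helper name hit is hypothetical. It is a small variant of the workflow: instead of inspecting the second column of the joined result, it asks foverlaps() for the matching row indices via which = TRUE.

```r
library(data.table)

# Ranges, with the made-up end already filled in for "UN"
ranges <- data.table(
  id    = c("AED", "CFR", "DRR", "DRR", "UN"),
  start = as.POSIXct(c("2018-05-17 10:00:00", "2018-05-17 10:18:00",
                       "2018-05-17 11:18:00", "2018-05-17 13:10:00",
                       "2018-05-17 14:18:00"), tz = "UTC"),
  end   = as.POSIXct(c("2018-05-17 11:56:00", "2018-05-17 12:23:00",
                       "2018-05-17 12:01:00", "2018-05-17 14:18:00",
                       "2018-05-17 16:18:00"), tz = "UTC")
)

# Instances, reconstructed from the result table (assumption)
instances <- data.table(
  id    = c("AED", "AED", "CFR", "DRR", "DRR", "DRR", "UN", "PO"),
  dates = as.POSIXct(c("2018-05-17 09:52:00", "2018-05-17 10:49:00",
                       "2018-05-17 10:38:00", "2018-05-17 11:29:00",
                       "2018-05-17 12:12:00", "2018-05-17 13:20:00",
                       "2018-05-17 14:28:00", "2018-05-17 15:59:00"), tz = "UTC")
)

# foverlaps() needs a key on the ranges table: grouping column, then start, end
setkey(ranges, id, start, end)

# Turn each timestamp into a zero-length interval (start == end)
instances[, `:=`(start = dates, end = dates)]

# With which = TRUE and mult = "first", foverlaps() returns one entry per row
# of instances: the row number of the first matching range, or NA if none
hit <- foverlaps(instances, ranges, type = "within",
                 mult = "first", nomatch = NA, which = TRUE)

instances[, inRange := !is.na(hit)]
instances[, outsideRange := !inRange]

# Drop the temporary interval columns
instances[, c("start", "end") := NULL]

print(instances)
```

Note that the interval bounds are inclusive, so a timestamp exactly equal to a range's start or end counts as inside that range.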
foverlaps() performs a keyed, binary-search-based overlap join, so this stays fast even for very large data.tables.
You can shorten the code, but I prefer to do the analysis one step at a time, which improves readability.
Chained using magrittr's pipe-operator
library(data.table)
library(magrittr)
ranges.dt <- setDT( ranges, key = c("id", "start", "end") )
result <- setDT( instances, key = NULL ) %>%
.[, c( "start", "end") := dates] %>%
.[, inRange := !is.na( foverlaps( ., ranges.dt, type = "within", mult = "first", nomatch = NA )[[2]] )] %>%
.[, outsideRange := !inRange] %>%
.[, c("start", "end") := NULL]