I have two tables to join based on date-time fields. To recreate the scenario let us take the commercial vs sale example. We want to know which sale is related to which commercial.
Sale can only be tagged to the last commercial and only if it occurred after the commercial.
Also, if a sale happened after multiple commercials, we can only tag the sale to the last commercial; prior commercials will have a null in the join.
I am unable to get this last part. If there is a sale after multiple commercials, then all such commercials are joined with that sale; which I don't want. In my example, the sale that happened at "2017-01-01 02:05:00" should join with the commercial that aired at "2017-01-01 02:00:00" and not the prior commercials.
library(lubridate)
library(data.table)
ts <- seq(as.POSIXct("2017-01-01", tz = "UTC"),
as.POSIXct("2017-01-02", tz = "UTC"),
by = "30 min")
commercial <-
data.table(
c_row_number = 1:10,
c_time = ts[1:10],
c_time_roll = ts[1:10]
)
sale <-
data.table(
s_row_number = 1:4,
s_time = ts[5:8] + minutes(5),
s_time_roll = ts[5:8] + minutes(5)
)
setkey(commercial, c_time_roll)
setkey(sale, s_time_roll)
tbl_joined <- sale[commercial, roll = -Inf] # , mult = 'last']
Any idea how can we get NAs where c_row_number is 1, 2, 3, and 4. Thank you.