I have two data frames like the dummy samples , df1 (main data set) and df2. These data originally coming from two different log data.
df1 is always complete dataset from 02:00 midnight until 02:00 next day with different stream values during this period. But df2 only has value if there is an activity in the specific time period (between 02:00 and next day 02:00).
I was wondering how can I combine df1 and df2. My goal is to have similar complete dataset like df1 ( from 02:00 to next day 02:00) with additional variable - code - this variable can be empty if there is no activity/code in that start / end time in df2. And sometime the row in df1 needs to broken into 2 if there is an activity only in some part of that start / end time period
I am looking for the result similar to this. Showing only 4 rows just for example
id STREAM_1 STREAM_2 STREAM_3 START END
401 NIVH-ON IN null 2022-08-16 7:43:30 2022-08-16 7:45:00
401 INVH-ON OUT 12be4 2022-08-16 7:45:00 2022-08-16 7:49:00
401 INVH-ON OUT 12be4 2022-08-16 7:49:00 2022-08-16 8:15:00
df1:
df1 <- tibble::tribble(
~id, ~STREAM_1, ~STREAM_2, ~START, ~END,
401L, "NIVH-OFF", "IN", "2022-08-16 2:00:00", "2022-08-16 3:35:00",
401L, "NIVH-OFF", "IN", "2022-08-16 3:35:00", "2022-08-16 7:22:45",
401L, "NIVH-ON", "IN", "2022-08-16 7:22:45", "2022-08-16 7:31:45",
401L, "NIVH-DOCK", "IN", "2022-08-16 7:31:45", "2022-08-16 7:43:30",
401L, "NIVH-ON", "IN", "2022-08-16 7:43:30", "2022-08-16 7:45:00",
401L, "INVH-ON", "IN", "2022-08-16 7:45:00", "2022-08-16 7:49:00",
401L, "INVH-ON", "OUT", "2022-08-16 7:49:00", "2022-08-16 8:50:00",
401L, "INVH-ON", "IN", "2022-08-16 8:50:00", "2022-08-16 8:56:00",
401L, "NIVH-ON", "IN", "2022-08-16 8:56:00", "2022-08-16 8:58:00",
401L, "INVH-ON", "IN", "2022-08-16 8:58:00", "2022-08-16 9:00:00",
401L, "NIVH-ON", "IN", "2022-08-16 9:00:00", "2022-08-16 9:24:00",
401L, "INVH-ON", "IN", "2022-08-16 9:24:00", "2022-08-16 9:25:00",
401L, "NIVH-ON", "IN", "2022-08-16 9:25:00", "2022-08-16 9:55:00",
401L, "NIVH-ON", "OUT", "2022-08-16 9:55:00", "2022-08-16 10:20:00",
401L, "NIVH-ON", "IN", "2022-08-16 10:20:00", "2022-08-16 11:30:00",
401L, "NIVH-ON", "OUT", "2022-08-16 11:30:00", "2022-08-16 12:00:00",
401L, "NIVH-ON", "IN", "2022-08-16 12:00:00", "2022-08-16 12:03:00",
401L, "INVH-ON", "IN", "2022-08-16 12:03:00", "2022-08-16 12:06:00",
401L, "NIVH-ON", "IN", "2022-08-16 12:06:00", "2022-08-16 12:08:00",
401L, "INVH-ON", "IN", "2022-08-16 12:08:00", "2022-08-16 12:11:00",
401L, "NIVH-ON", "IN", "2022-08-16 12:11:00", "2022-08-16 12:33:00",
401L, "INVH-ON", "IN", "2022-08-16 12:33:00", "2022-08-16 12:34:00",
401L, "NIVH-ON", "IN", "2022-08-16 12:34:00", "2022-08-16 13:25:00",
401L, "NIVH-ON", "IN", "2022-08-16 13:25:00", "2022-08-16 13:35:00",
401L, "NIVH-ON", "IN", "2022-08-16 13:35:00", "2022-08-16 14:14:00",
401L, "NIVH-ON", "IN", "2022-08-16 14:14:00", "2022-08-16 14:15:00",
401L, "INVH-ON", "IN", "2022-08-16 14:15:00", "2022-08-16 14:19:00",
401L, "NIVH-ON", "IN", "2022-08-16 14:19:00", "2022-08-16 15:04:00",
401L, "NIVH-ON", "IN", "2022-08-16 15:04:00", "2022-08-16 15:09:00",
401L, "NIVH-ON", "IN", "2022-08-16 15:09:00", "2022-08-16 15:24:00",
401L, "NIVH-ON", "OUT", "2022-08-16 15:24:00", "2022-08-16 15:26:00",
401L, "INVH-ON", "OUT", "2022-08-16 15:26:00", "2022-08-16 16:04:00",
401L, "INVH-ON", "IN", "2022-08-16 16:04:00", "2022-08-16 16:07:00",
401L, "NIVH-ON", "IN", "2022-08-16 16:07:00", "2022-08-16 16:13:00",
401L, "INVH-ON", "IN", "2022-08-16 16:13:00", "2022-08-16 16:15:00",
401L, "NIVH-ON", "IN", "2022-08-16 16:15:00", "2022-08-16 16:20:15",
401L, "NIVH-DOCK", "IN", "2022-08-16 16:20:15", "2022-08-17 2:00:00"
)
df2:
df2 <- tibble::tribble(
~id, ~code, ~date, ~start_time, ~endtime,
401L, "12be4", 20220816L, "2022-08-16 07:45:15", "2022-08-16 08:15:15",
401L, "12be4", 20220816L, "2022-08-16 08:15:45", "2022-08-16 08:21:15",
401L, "12be4", 20220816L, "2022-08-16 08:21:30", "2022-08-16 08:40:15",
401L, "11e44", 20220816L, "2022-08-16 14:49:30", "2022-08-16 15:17:30",
401L, "12be4", 20220816L, "2022-08-16 15:24:30", "2022-08-16 15:33:15",
401L, "1fb84", 20220816L, "2022-08-16 15:33:15", "2022-08-16 15:43:45",
401L, "1fb84", 20220816L, "2022-08-16 15:46:30", "2022-08-16 15:50:30",
401L, "1b1c4", 20220816L, "2022-08-16 15:50:30", "2022-08-16 16:02:45"
)