1

I wish to filter a df by a time interval in another df.

Each df has an id variable, and the time intervals are unique to the id.

I have two data frames, both have a common 'id' variable. The first df has the start and stop time, and the second has the data that I wish to trim by the start and stop time for each id.

df_1:

id             start                    stop
<dbl>          <dttm>                  <dttm>
12        2018-04-10 12:00:00     2018-04-10 18:00:00
18        2018-02-01 04:00:00     2018-02-01 09:00:00


df_1 <- structure(list(id = c(12, 18), 
start = structure(c(1523361600, 1517457600), 
class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
stop = structure(c(1523383200, 1517475600), 
class = c("POSIXct", "POSIXt"), tzone = "UTC")), 
.Names = c("id", "time_on", "time_off"), 
class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -2L))

df_2:

id  timestamp
12  10/04/2018 11:00
12  10/04/2018 12:00
12  10/04/2018 13:00
12  10/04/2018 14:00
12  10/04/2018 15:00
12  10/04/2018 16:00
12  10/04/2018 17:00
12  10/04/2018 18:00
12  10/04/2018 19:00
12  10/04/2018 20:00
18  01/02/2018 01:00
18  01/02/2018 02:00
18  01/02/2018 03:00
18  01/02/2018 04:00
18  01/02/2018 05:00
18  01/02/2018 06:00
18  01/02/2018 07:00
18  01/02/2018 08:00
18  01/02/2018 09:00
18  01/02/2018 10:00


df_2 <- structure(list(id = c(12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 
18, 18, 18, 18, 18, 18, 18, 18, 18, 18), 
timestamp = structure(c(1523358000, 1523361600, 1523365200, 
1523368800, 1523372400, 1523376000, 1523379600, 1523383200, 
1523386800, 1523390400, 1517446800, 1517450400, 1517454000.005,
1517457600.01, 1517461200.015, 1517464800.02, 1517468400.025, 
1517472000.03, 1517475600.035, 1517479200.04), 
class = c("POSIXct", "POSIXt"), tzone = "UTC")), 
.Names = c("id", "timestamp"), 
class = c("tbl_df", "tbl", "data.frame"), 
row.names = c(NA, -20L))

using the lubridate package I have created 'interval' variable:

df_1 <- mutate(df_1, 
               interval = (interval(start, stop)))

do I need to use a loop to crop df_2 by the intervals unique to the 'id' variable? Possibly using %within% ?

I would like to end up with:

df_2:

12  10/04/2018 12:00
12  10/04/2018 13:00
12  10/04/2018 14:00
12  10/04/2018 15:00
12  10/04/2018 16:00
12  10/04/2018 17:00
12  10/04/2018 18:00
18  01/02/2018 04:00
18  01/02/2018 05:00
18  01/02/2018 06:00
18  01/02/2018 07:00
18  01/02/2018 08:00
18  01/02/2018 09:00
Sotos
  • 51,121
  • 6
  • 32
  • 66
Pete
  • 600
  • 1
  • 6
  • 16

0 Answers0