0

I have a dataset with many rows of multiple dates, and for each date, data associated with multiple times for that day. I want to filter down to one particular time of day (the same time each day, which is stock market open, EST) for each day in the dataset.

Example Problem:

    dat1 <- tibble(
      DateTime = rep(as.POSIXct(c("2019-02-21 09:30:00", 
                                  "2019-02-21 16:30:00",
                                  "2019-02-22 09:30:00", 
                                  "2019-02-22 16:30:00",
                                  "2019-02-25 09:30:00", 
                                  "2019-02-25 16:30:00")), each=3),
      Ticker = c("AAPL", "AAPL", "AAPL", "AAPL", "AAPL", "AAPL", 
                 "ABC", "ABC", "ABC", "ABC", "ABC", "ABC", 
                 "ADBE", "ADBE", "ADBE", "ADBE", "ADBE", "ADBE"),
      Thirtymin = c(11000000,12000000,11500000,12500000,13000000,13400000,
                    30000,35000,33000,80000,280000,250000,
                    2000,2100,2200,2340,2300,2350),
      Tgt_Shares = c(2500,2600,2400,2350,2600,2700,
                     1000,2000,3000,2000,1500,2000,
                     100,350,600,450,390,475)
    )

I could, of course, use the following...

    dat2 <- dat1 %>%
      filter (
          DateTime ==  "2019-02-21 09:30:00" |
          DateTime == "2019-02-22 09:30:00" |
          DateTime == "2019-02-25 09:30:00" 
    )

...but the actual dataset is too big for this approach -- though in this simple case, it is the desired results.

I see lots of posts on date/times, of course, but many are not in respect to dplyr or even R, and I haven't seen any that filter a particular time for each of several trading (non-consecutive) days.

I'm sure there is a simple answer, but I'm missing it!

W Barker
  • 324
  • 2
  • 8

3 Answers3

2

Using format for hour %H, minute %M and second %S, or the equivalent shortcut %T for time.

library(dplyr)

dat1 %>% 
  filter(format(DateTime, "%H:%M:%S") == "09:30:00")
# A tibble: 9 × 4
  DateTime            Ticker Thirtymin Tgt_Shares
  <dttm>              <chr>      <dbl>      <dbl>
1 2019-02-21 09:30:00 AAPL    11000000       2500
2 2019-02-21 09:30:00 AAPL    12000000       2600
3 2019-02-21 09:30:00 AAPL    11500000       2400
4 2019-02-22 09:30:00 ABC        30000       1000
5 2019-02-22 09:30:00 ABC        35000       2000
6 2019-02-22 09:30:00 ABC        33000       3000
7 2019-02-25 09:30:00 ADBE        2000        100
8 2019-02-25 09:30:00 ADBE        2100        350
9 2019-02-25 09:30:00 ADBE        2200        600
Andre Wildberg
  • 12,344
  • 3
  • 12
  • 29
1
dat1 %>%
  mutate(hour_dec = lubridate::hour(DateTime) + lubridate::minute(DateTime)/60) %>%
  filter(near(hour_dec, 9.5))

I'm using near here in case there could be floating point issues.

Result

# A tibble: 9 × 5
  DateTime            Ticker Thirtymin Tgt_Shares hour_dec
  <dttm>              <chr>      <dbl>      <dbl>    <dbl>
1 2019-02-21 09:30:00 AAPL    11000000       2500      9.5
2 2019-02-21 09:30:00 AAPL    12000000       2600      9.5
3 2019-02-21 09:30:00 AAPL    11500000       2400      9.5
4 2019-02-22 09:30:00 ABC        30000       1000      9.5
5 2019-02-22 09:30:00 ABC        35000       2000      9.5
6 2019-02-22 09:30:00 ABC        33000       3000      9.5
7 2019-02-25 09:30:00 ADBE        2000        100      9.5
8 2019-02-25 09:30:00 ADBE        2100        350      9.5
9 2019-02-25 09:30:00 ADBE        2200        600      9.5
Jon Spring
  • 55,165
  • 4
  • 35
  • 53
1

with hms library:

library(dplyr)
library(hms)

dat1 |>
  filter(as_hms(DateTime) == parse_hms("09:30:00"))

# A tibble: 9 × 4
  DateTime            Ticker Thirtymin Tgt_Shares
  <dttm>              <chr>      <dbl>      <dbl>
1 2019-02-21 09:30:00 AAPL    11000000       2500
2 2019-02-21 09:30:00 AAPL    12000000       2600
3 2019-02-21 09:30:00 AAPL    11500000       2400
4 2019-02-22 09:30:00 ABC        30000       1000
5 2019-02-22 09:30:00 ABC        35000       2000
6 2019-02-22 09:30:00 ABC        33000       3000
7 2019-02-25 09:30:00 ADBE        2000        100
8 2019-02-25 09:30:00 ADBE        2100        350
9 2019-02-25 09:30:00 ADBE        2200        600
MarBlo
  • 4,195
  • 1
  • 13
  • 27