
I have a data frame, of which one column contains time data (YYYY-MM-DD hh:mm:ss) and another contains staff availability figures (the number of staff members available at a specific time).

I want to total the number of available staff members within each hour (00 to 23), that is, over all rows whose minute value falls between 00 and 59 of that hour.

Consider this sample data frame (df):

TIME                   STAFF_AVAIL
2018-01-01 06:58:53    05
2018-01-01 07:00:00    10
2018-01-01 07:04:45    20
2018-01-01 07:04:55    25
2018-01-01 07:21:00    20
2018-01-01 07:58:04    18
2018-01-01 07:59:59    12
2018-01-01 08:00:00    17
2018-01-01 08:01:04    30

I want to count the number of staff available for every hour up until one second (inclusive) before the next hour begins.

Using df as an example, I want to count the number of staff available where the hour value is equal to 7 and where the respective minutes are between 00-59 (inclusive).

In this example, the number of staff available between 07:00:00 and 07:59:59 (inclusive) is 105 (10 + 20 + 25 + 20 + 18 + 12).

How can I achieve this?

Mus

1 Answer


Here is a lubridate plus dplyr solution:

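library(lubridate)
library(dplyr)

df %>%
    mutate(
        # Parse the character timestamps into date-times
        TIME = ymd_hms(TIME),
        # Round each timestamp down to the start of its hour
        TIME.hr.bin = floor_date(TIME, unit = "hour")) %>%
    group_by(TIME.hr.bin) %>%
    # Sum the staff figures within each hourly bin
    summarise(n = sum(STAFF_AVAIL))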
#  TIME.hr.bin             n
#  <dttm>              <int>
#1 2018-01-01 06:00:00     5
#2 2018-01-01 07:00:00   105
#3 2018-01-01 08:00:00    47
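
If you would rather not load extra packages, the same hourly binning can be sketched in base R. This is only a minimal sketch and not part of the solution above: the UTC time zone is an assumption, and the names `times`, `hour_bin` and `hourly` are illustrative.

```r
# Sample data from the question, rebuilt so the block runs on its own
df <- data.frame(
  TIME = c("2018-01-01 06:58:53", "2018-01-01 07:00:00", "2018-01-01 07:04:45",
           "2018-01-01 07:04:55", "2018-01-01 07:21:00", "2018-01-01 07:58:04",
           "2018-01-01 07:59:59", "2018-01-01 08:00:00", "2018-01-01 08:01:04"),
  STAFF_AVAIL = c(5L, 10L, 20L, 25L, 20L, 18L, 12L, 17L, 30L),
  stringsAsFactors = FALSE
)

# Parse the timestamps; the time zone (UTC) is an assumption
times <- as.POSIXct(df$TIME, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Truncate each timestamp to the start of its hour
# (a base R stand-in for lubridate::floor_date(unit = "hour"))
df$hour_bin <- format(times, "%Y-%m-%d %H:00:00")

# Sum the staff figures within each hourly bin
hourly <- aggregate(STAFF_AVAIL ~ hour_bin, data = df, FUN = sum)
hourly
```

On the sample data this should give the same three totals as above (5, 105 and 47). Note that `hour_bin` is a character column here rather than a date-time, which is fine for grouping but would need converting back with `as.POSIXct()` before any date arithmetic.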

Sample data

df <- read.table(text =
    "TIME                   STAFF_AVAIL
'2018-01-01 06:58:53'    05
'2018-01-01 07:00:00'    10
'2018-01-01 07:04:45'    20
'2018-01-01 07:04:55'    25
'2018-01-01 07:21:00'    20
'2018-01-01 07:58:04'    18
'2018-01-01 07:59:59'    12
'2018-01-01 08:00:00'    17
'2018-01-01 08:01:04'    30", header = TRUE)
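
The question can also be read as asking for totals per hour of day (00 to 23) regardless of date. The two readings only differ once the data span more than one day. A hedged base R sketch of that variant follows; the UTC time zone and the names `times`, `hour_of_day` and `by_hour` are assumptions made for illustration.

```r
# Sample data as given in the answer
df <- read.table(text =
    "TIME                   STAFF_AVAIL
'2018-01-01 06:58:53'    05
'2018-01-01 07:00:00'    10
'2018-01-01 07:04:45'    20
'2018-01-01 07:04:55'    25
'2018-01-01 07:21:00'    20
'2018-01-01 07:58:04'    18
'2018-01-01 07:59:59'    12
'2018-01-01 08:00:00'    17
'2018-01-01 08:01:04'    30", header = TRUE)

# Parse the timestamps; as.character() guards against TIME being read as a
# factor, and the time zone (UTC) is an assumption
times <- as.POSIXct(as.character(df$TIME),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Extract the hour of day (0-23), ignoring the date
hour_of_day <- as.integer(format(times, "%H"))

# Sum the staff figures per hour of day; the result is named by hour
by_hour <- tapply(df$STAFF_AVAIL, hour_of_day, sum)
by_hour["7"]
```

For the sample data the entry for hour 7 is 105, matching the figure in the question. With several days of data, every 07:xx:xx row from every day would be added into that same entry.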
Maurits Evers