Count values based on criteria

Question

I have a data frame in which one column contains timestamps (YYYY-MM-DD hh:mm:ss) and another contains staff availability figures (the number of staff members available at that time).

I want to total the number of available staff members for each hour (00-23), covering every minute of that hour (00-59).

Consider this sample data frame (df):

TIME                   STAFF_AVAIL
2018-01-01 06:58:53    05
2018-01-01 07:00:00    10
2018-01-01 07:04:45    20
2018-01-01 07:04:55    25
2018-01-01 07:21:00    20
2018-01-01 07:58:04    18
2018-01-01 07:59:59    12
2018-01-01 08:00:00    17
2018-01-01 08:01:04    30

I want to count the number of staff available for every hour up until one second (inclusive) before the next hour begins.

Using df as an example, I want to count the number of staff available where the hour value is equal to 7 and where the respective minutes are between 00-59 (inclusive).

In this example, the number of staff available between 07:00:00 and 07:59:59 is 105 (10 + 20 + 25 + 20 + 18 + 12).

How can I achieve this?

base R: `cut.Date` + `aggregate` – Ben Bolker, Mar 19 '18 at 12:27
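For reference, here is a minimal base R sketch of what this comment seems to suggest. Because `TIME` holds date-times rather than dates, the sketch assumes the column is first converted with `as.POSIXct`, in which case `cut()` dispatches to `cut.POSIXt` rather than `cut.Date`. The column name `HOUR_BIN` and the `tz = "UTC"` setting are illustrative choices, not part of the original comment.

```r
# Sample data from the question, rebuilt so the snippet runs on its own
df <- data.frame(
    TIME = c("2018-01-01 06:58:53", "2018-01-01 07:00:00", "2018-01-01 07:04:45",
             "2018-01-01 07:04:55", "2018-01-01 07:21:00", "2018-01-01 07:58:04",
             "2018-01-01 07:59:59", "2018-01-01 08:00:00", "2018-01-01 08:01:04"),
    STAFF_AVAIL = c(5, 10, 20, 25, 20, 18, 12, 17, 30),
    stringsAsFactors = FALSE
)

# Parse the strings as date-times (time zone is an assumption)
df$TIME <- as.POSIXct(df$TIME, tz = "UTC")

# Assign each row to the hour it falls in; HOUR_BIN is a hypothetical name
df$HOUR_BIN <- cut(df$TIME, breaks = "hour")

# Sum staff availability within each hourly bin
res <- aggregate(STAFF_AVAIL ~ HOUR_BIN, data = df, FUN = sum)
print(res)
```

With the sample data this should give one row per hour, with totals of 5, 105 and 47.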

score 0 · Answer 1 · answered Mar 19 '18 at 12:39

Here is a lubridate plus dplyr solution:

library(lubridate)
library(dplyr)

df %>%
    mutate(
        TIME = ymd_hms(TIME),                           # parse the strings as date-times
        TIME.hr.bin = floor_date(TIME, unit = "hour")   # round down to the start of the hour
    ) %>%
    group_by(TIME.hr.bin) %>%
    summarise(n = sum(STAFF_AVAIL))                     # total staff per hourly bin
#  TIME.hr.bin             n
#  <dttm>              <int>
#1 2018-01-01 06:00:00     5
#2 2018-01-01 07:00:00   105
#3 2018-01-01 08:00:00    47

Sample data

df <- read.table(text =
    "TIME                   STAFF_AVAIL
'2018-01-01 06:58:53'    05
'2018-01-01 07:00:00'    10
'2018-01-01 07:04:45'    20
'2018-01-01 07:04:55'    25
'2018-01-01 07:21:00'    20
'2018-01-01 07:58:04'    18
'2018-01-01 07:59:59'    12
'2018-01-01 08:00:00'    17
'2018-01-01 08:01:04'    30", header = TRUE)
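If only a single hour is of interest, as in the question's example for hour 7, a filter on the hour component may be enough. The following is a rough sketch under that assumption; the result name `hour7` is illustrative. Note that it matches on the hour alone, so with data spanning several days it would pool hour 7 across all of them.

```r
library(lubridate)
library(dplyr)

# Sample data from the question, rebuilt so the snippet runs on its own
df <- data.frame(
    TIME = c("2018-01-01 06:58:53", "2018-01-01 07:00:00", "2018-01-01 07:04:45",
             "2018-01-01 07:04:55", "2018-01-01 07:21:00", "2018-01-01 07:58:04",
             "2018-01-01 07:59:59", "2018-01-01 08:00:00", "2018-01-01 08:01:04"),
    STAFF_AVAIL = c(5L, 10L, 20L, 25L, 20L, 18L, 12L, 17L, 30L),
    stringsAsFactors = FALSE
)

# Keep only rows whose hour component is 7, then total the staff figures
hour7 <- df %>%
    mutate(TIME = ymd_hms(TIME)) %>%
    filter(hour(TIME) == 7) %>%
    summarise(n = sum(STAFF_AVAIL))

print(hour7)
```

For the sample data, `hour7$n` should be 105, matching the figure in the question.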
