3

I'm relatively new to R but I am very familiar with Excel and T-SQL.

I have a simple dataset that has a date with time and a numeric value associated it. What I'd like to do is summarize the numeric values by-hour of the day. I've found a couple resources for working with time-types in R but I was hoping to find a solution similar to is offered excel (where I can call a function and pass-in my date/time data and have it return the hour of the day).

Any suggestions would be appreciated - thanks!

npjc
  • 4,134
  • 1
  • 22
  • 34
Sevyns
  • 2,992
  • 5
  • 19
  • 23
  • can you post starting data and intended output so you can be better helped. also see: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – npjc Apr 27 '15 at 18:29
  • could you please post an example of your data. – miles2know Apr 27 '15 at 18:29
  • Thanks guys - sure can: Input DateTime | Value 3/14/2015 12:00:00 | 23 3/14/2015 13:00:00 | 24 3/15/2015 12:00:00 | 22 3/15/2015 13:00:00 | 40 Output Time | MeanValue 12:00:00 | 34 13:00:00 | 44 – Sevyns Apr 27 '15 at 18:35

1 Answers1

7
library(readr)
library(dplyr)
library(lubridate)


df <- read_delim('DateTime|Value
3/14/2015 12:00:00|23
3/14/2015 13:00:00|24
3/15/2015 12:00:00|22
3/15/2015 13:00:00|40',"|")

df %>% 
  mutate(hour_of_day = hour(as.POSIXct(strptime(DateTime, "%m/%d/%Y %H:%M:%S")))) %>% 
  group_by(hour_of_day) %>% 
  summarise(meanValue = mean(Value))

breakdown:

Convert column of DateTime (character) into formatted time then use hour() from lubridate to pull out just that hour value and put it into new column named hour_of_day.

> df %>% 
       mutate(hour_of_day = hour(as.POSIXct(strptime(DateTime, "%m/%d/%Y %H:%M:%S"))))
Source: local data frame [4 x 3]

            DateTime Value hour_of_day
1 3/14/2015 12:00:00    23          12
2 3/14/2015 13:00:00    24          13
3 3/15/2015 12:00:00    22          12
4 3/15/2015 13:00:00    40          13

The group_by(hour_of_day) sets the groups upon which mean(Value) is computed in the via the summarise(...) call.

this gives the result:

  hour_of_day meanValue
1          12      22.5
2          13      32.0
npjc
  • 4,134
  • 1
  • 22
  • 34