I'm trying to resample a dataset of hourly ozone measurements from the EPA AQS source: https://aqs.epa.gov/aqsweb/airdata/hourly_44201_2016.zip
Here is the dput() output for the first 30 rows of the data:
structure(list(date_time = structure(c(1456844400, 1456848000,
1456851600, 1456855200, 1456858800, 1456862400, 1456866000, 1456869600,
1456873200, 1456880400, 1456884000, 1456887600, 1456891200, 1456894800,
1456898400, 1456902000, 1456905600, 1456912800, 1456916400, 1456920000,
1456923600, 1456927200, 1456930800, 1456934400, 1456938000, 1456941600,
1456945200, 1456948800, 1456952400, 1456956000), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Sample.Measurement = c(0.041, 0.041,
0.042, 0.041, 0.038, 0.038, 0.036, 0.035, 0.029, 0.026, 0.03,
0.03, 0.028, 0.027, 0.025, 0.023, 0.025, 0.034, 0.036, 0.038,
0.041, 0.042, 0.043, 0.043, 0.041, 0.033, 0.01, 0.01, 0.011,
0.007)), .Names = c("date_time", "Sample.Measurement"), row.names = c(NA,
30L), class = "data.frame")
I've combined the local date and time columns into a single datetime column using lubridate:
library(lubridate)
df$date_time <- ymd_hm(paste(df$Date.Local, df$Time.Local))
I then want to resample the Sample.Measurement values into an eight-hour rolling mean, and from that select the maximum rolling-mean value for each day.
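To make the goal concrete, here is a minimal sketch of the computation I have in mind. It is only an illustration under some assumptions: it uses zoo's rollmeanr() for the trailing window (my choice, nothing in the data requires it), it runs on a small made-up data frame rather than the EPA file, and it treats eight consecutive rows as eight hours, which only holds if no hourly readings are missing (the real data has gaps).

```r
library(dplyr)
library(zoo)

# Made-up example data: 48 hourly readings over two days (NOT the EPA data).
df <- data.frame(
  date_time = seq(as.POSIXct("2016-03-01 00:00", tz = "UTC"),
                  by = "hour", length.out = 48),
  Sample.Measurement = c(rep(0.02, 24), rep(0.04, 24))
)

daily_max <- df %>%
  arrange(date_time) %>%
  # Trailing 8-row mean; NA until 8 readings are available.
  # Assumption: one row per hour with no gaps, so 8 rows == 8 hours.
  mutate(roll8 = rollmeanr(Sample.Measurement, k = 8, fill = NA)) %>%
  # Group by calendar day (the timestamps here are in UTC).
  group_by(day = as.Date(date_time)) %>%
  # Highest 8-hour mean per day, ignoring the leading NA values.
  summarise(max_roll8 = max(roll8, na.rm = TRUE))

print(daily_max)
```

With gaps in the hourly series, the rows would presumably need to be padded to a complete hourly sequence first, otherwise the window covers more than eight hours.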
In pandas, this would be trivial using the rolling() and resample() methods.
How do I do this in R, preferably with dplyr?