I have a data frame `RH` with hourly data, and I want to convert it to daily maximum and minimum values. The code from this question was very useful: Aggregating hourly data into daily aggregates

RH$Date <- strptime(RH$Date, format="%Y/%m/%d")
RH$day <- trunc(RH$Date,"day")

require(plyr)

x <- ddply(RH,.(Date),
  summarize,
  aveRH=mean(RH),
  maxRH=max(RH),
  minRH=min(RH)
)

But the first 5 years of my data are 3-hourly, not hourly, so I get no results for those years. Any suggestions? Thank you in advance.

'data.frame': 201600 obs. of 3 variables:
 $ Date: chr "1985/01/01" "1985/01/01" "1985/01/01" "1985/01/01" ...
 $ Hour: int 1 2 3 4 5 6 7 8 9 10 ...
 $ RH  : int NA NA 93 NA NA NA NA NA 79 NA ...
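
For illustration, here is a minimal base-R sketch of the situation described above. The toy values are made up, and it assumes that the 3-hourly years sit on the same hourly grid with `NA` in the missing hours, as the `str()` output suggests. It shows that one `NA` in a day turns the daily `max` into `NA` unless `na.rm = TRUE` is passed.

```r
# Toy data (invented values): the first day is 3-hourly on an hourly grid,
# so two out of every three hours are NA; the second day is fully hourly.
RH <- data.frame(
  Date = rep(c("1985/01/01", "1990/01/01"), each = 24),
  Hour = rep(1:24, times = 2),
  RH   = c(NA, NA, 93L, NA, NA, 88L, NA, NA, 79L, NA, NA, 70L,
           NA, NA, 65L, NA, NA, 72L, NA, NA, 85L, NA, NA, 91L,
           61:84),
  stringsAsFactors = FALSE
)

# Four-digit years need %Y (upper case), not %y.
RH$Date <- as.Date(RH$Date, format = "%Y/%m/%d")

# Without na.rm = TRUE, any NA in the day makes the result NA.
max_with_na <- max(RH$RH[RH$Date == as.Date("1985-01-01")])

# With na.rm = TRUE, the NA hours are skipped within each day.
daily <- data.frame(
  Date  = sort(unique(RH$Date)),
  maxRH = as.vector(tapply(RH$RH, RH$Date, max, na.rm = TRUE)),
  minRH = as.vector(tapply(RH$RH, RH$Date, min, na.rm = TRUE))
)
print(daily)
```

Note that a day containing only `NA` values would give `-Inf` for the maximum and `Inf` for the minimum (with a warning) under `na.rm = TRUE`, so such days may need separate handling.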

  • So do you want the first five years left out of the result, or do you need them to show up with all values as `NA`? – BENY Jul 03 '17 at 00:49
  • The link you provided is old. The code should still work, but there are more recent approaches. We can't fix the problem because you did not provide data. Would you like an example of recent code that should work with your data? – Pierre Lapointe Jul 03 '17 at 00:51
  • @PLapointe same here, I just found that the question was from 6 years ago ...:( – BENY Jul 03 '17 at 00:53
  • Thanks for your answers. I need the maximum and minimum data for all years, but this code shows NA for the first 5 years, which are 3-hourly rather than hourly. I don't know how to convert data with 5 years at 3-hour intervals followed by 20 years at hourly intervals. My data period is 25 years. – user3575805 Jul 03 '17 at 00:58
  • @user3575805 I added a "modern" version of what you are trying to do. It should work with your data. – Pierre Lapointe Jul 03 '17 at 01:01

1 Answer

The link you provided is an old one. The code is still perfectly good and would work, but here's a more modern version using dplyr and lubridate:

df <- read.table(text='date_time value
"01/01/2000 01:00" 30
"01/01/2000 02:00" 31
"01/01/2000 03:00" 33
"12/31/2000 23:00" 25',header=TRUE,stringsAsFactors=FALSE)

library(dplyr);library(lubridate)
df %>%
  mutate(date_time=as.POSIXct(date_time,format="%m/%d/%Y %H:%M")) %>%
  group_by(date(date_time)) %>%
  summarise(mean=mean(value,na.rm=TRUE),max=max(value,na.rm=TRUE),
            min=min(value,na.rm=TRUE))

  `date(date_time)`     mean   max   min
             <date>    <dbl> <dbl> <dbl>
1        2000-01-01 31.33333    33    30
2        2000-12-31 25.00000    25    25

EDIT: Since there's already a date column, this should work:

RH %>% 
 group_by(Date) %>% 
 summarise(mean=mean(RH,na.rm=TRUE),max=max(RH,na.rm=TRUE), 
           min=min(RH,na.rm=TRUE))
Pierre Lapointe
  • Thanks. My data frame has 3 columns: Date, Hour and RH. I used your new code, but it did not work for me: `library(dplyr);library(lubridate) RH %>% mutate(Date=as.POSIXct(Date,format="%y/%m/%d")) %>% group_by(date(Date)) %>% summarise(mean=mean(RH,na.rm=TRUE),max=max(RH,na.rm=TRUE), min=min(RH,na.rm=TRUE))`. However, changing the old code by adding "na.rm=TRUE" worked. – user3575805 Jul 03 '17 at 01:31
  • @user3575805 See my edit. This assumes that your data.frame is called `RH` and that it has an `RH` column. – Pierre Lapointe Jul 03 '17 at 01:36
  • I tried the new one, but it only showed a single row, with incorrect values for max and min. The result is `mean max min 1 67.15332 100 10` – user3575805 Jul 03 '17 at 01:43
  • @user3575805 Can you add the result of `str(RH)` in your question? – Pierre Lapointe Jul 03 '17 at 01:45
  • `'data.frame': 201600 obs. of 3 variables: $ Date: chr "1985/01/01" "1985/01/01" "1985/01/01" "1985/01/01" ... $ Hour: int 1 2 3 4 5 6 7 8 9 10 ... $ RH : int NA NA 93 NA NA NA NA NA 79 NA ...` – user3575805 Jul 03 '17 at 01:47
  • @user3575805 My solution should work. Why are the results wrong? – Pierre Lapointe Jul 03 '17 at 01:52
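
One possible explanation for the single-row result, which the thread does not confirm: the question loads plyr, and if plyr is attached after dplyr, a bare `summarise` resolves to `plyr::summarise`, which ignores the dplyr grouping and returns one row for the whole data frame. A minimal sketch with made-up values, assuming that is the cause, calls the dplyr version explicitly:

```r
library(dplyr)

# Toy data (invented values) with the same column names as in the question.
RH <- data.frame(
  Date = rep(c("1985/01/01", "1985/01/02"), each = 3),
  RH   = c(NA, 93L, 79L, 60L, NA, 70L),
  stringsAsFactors = FALSE
)

# The dplyr:: prefix makes sure the grouped summarise is used even when
# plyr has been attached later in the session and masks summarise().
daily <- RH %>%
  group_by(Date) %>%
  dplyr::summarise(max = max(RH, na.rm = TRUE),
                   min = min(RH, na.rm = TRUE))
print(daily)
```

If this is indeed the issue, the result should have one row per date instead of a single overall row.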