4

I have hourly rainfall and temperature data for a long period, and I would like to get daily values from the hourly data. By "day" I mean the period from 07:00:00 to 07:00:00 the next day.

Could you tell me how to convert hourly data to daily data over a specific time interval (for example, 07:00:00 to 07:00:00, or 12:00:00 to 12:00:00)?

Rainfall data looks like:

1970-01-05 00:00:00      1.0 
1970-01-05 01:00:00      1.0
1970-01-05 02:00:00      1.0
1970-01-05 03:00:00      1.0
1970-01-05 04:00:00      1.0
1970-01-05 05:00:00      3.6
1970-01-05 06:00:00      3.6
1970-01-05 07:00:00      2.2
1970-01-05 08:00:00      2.2
1970-01-05 09:00:00      2.2
1970-01-05 10:00:00      2.2
1970-01-05 11:00:00      2.2
1970-01-05 12:00:00      2.2
1970-01-05 13:00:00      2.2
1970-01-05 14:00:00      2.2
1970-01-05 15:00:00      2.2
1970-01-05 16:00:00      0.0
1970-01-05 17:00:00      0.0
1970-01-05 18:00:00      0.0
1970-01-05 19:00:00      0.0
1970-01-05 20:00:00      0.0
1970-01-05 21:00:00      0.0
1970-01-05 22:00:00      0.0
1970-01-05 23:00:00      0.0
1970-01-06 00:00:00      0.0
sgibb
user1537175

4 Answers

5

First, create some reproducible data so we can help you better:

library(xts)
set.seed(1)
# Keep the timestamps as POSIXct; wrapping them in as.Date() would drop the time of day
X = data.frame(When = seq(from = ISOdatetime(2012, 01, 01, 00, 00, 00),
                          length.out = 100, by="1 hour"),
               Measurements = sample(1:20, 100, replace=TRUE))

We now have a data frame with 100 hourly observations where the dates start at 2012-01-01 00:00:00 and end at 2012-01-05 03:00:00 (time is in 24-hour format).

Second, convert it to an XTS object.

X2 = xts(X$Measurements, order.by=X$When)

Third, learn how to subset a specific time window.

X2['T04:00/T08:00']
#                     [,1]
# 2012-01-01 04:00:00    5
# 2012-01-01 05:00:00   18
# 2012-01-01 06:00:00   19
# 2012-01-01 07:00:00   14
# 2012-01-01 08:00:00   13
# 2012-01-02 04:00:00   18
# 2012-01-02 05:00:00    7
# 2012-01-02 06:00:00   10
# 2012-01-02 07:00:00   12
# 2012-01-02 08:00:00   10
# 2012-01-03 04:00:00    9
# 2012-01-03 05:00:00    5
# 2012-01-03 06:00:00    2
# 2012-01-03 07:00:00    2
# 2012-01-03 08:00:00    7
# 2012-01-04 04:00:00   18
# 2012-01-04 05:00:00    8
# 2012-01-04 06:00:00   16
# 2012-01-04 07:00:00   20
# 2012-01-04 08:00:00    9

Fourth, use that information with apply.daily and whatever function you want, as follows:

apply.daily(X2['T04:00/T08:00'], mean)
#                     [,1]
# 2012-01-01 08:00:00 13.8
# 2012-01-02 08:00:00 11.4
# 2012-01-03 08:00:00  5.0
# 2012-01-04 08:00:00 14.2

Update: Custom endpoints

After re-reading your question, I see that I misinterpreted what you wanted.

It seems that you want to take the mean of a 24 hour period, not necessarily from midnight to midnight.

For this, you should drop apply.daily and use period.apply with custom endpoints instead, like this:

# You want to start at 7AM. Find the position of the first record at 7AM.
A = which(as.character(index(X2)) == "2012-01-01 07:00:00")

# Use that to create your endpoints.
# The endpoint vector must start at 0
# and end at the total number of records.
ep = c(0, seq(A, nrow(X2), by=24), nrow(X2))
period.apply(X2, INDEX=ep, FUN=mean)
#                         [,1]
# 2012-01-01 07:00:00 12.62500
# 2012-01-02 07:00:00 10.08333
# 2012-01-03 07:00:00 10.79167
# 2012-01-04 07:00:00 11.54167
# 2012-01-05 03:00:00 10.25000
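
If the series is long, the endpoints can also be derived from the timestamps themselves rather than from a hard-coded start date. The following is only a sketch under assumptions: the data (`when`, `rain`, `rain_xts`) are made up for illustration, the series is hourly in UTC with no gaps, and rainfall is summed rather than averaged.

```r
library(xts)

# Hypothetical data: 48 hourly rainfall values starting at midnight UTC
when <- seq(ISOdatetime(2012, 1, 1, 0, 0, 0, tz = "UTC"),
            by = "1 hour", length.out = 48)
rain <- rep(c(1, 2), each = 24)
rain_xts <- xts(rain, order.by = when)

# Every 07:00 record closes a period; add 0 and nrow() as the outer endpoints
at_seven <- which(format(index(rain_xts), "%H:%M") == "07:00")
ep <- unique(c(0, at_seven, nrow(rain_xts)))

# Sum within each period (use mean for temperature)
daily_rain <- period.apply(rain_xts, INDEX = ep, FUN = sum)
print(daily_rain)
```

As in the output above, the first and last periods are partial, so you may want to drop them before further analysis.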
A5C1D2H2I1M1N2O1R2T1
  • @Andrie, thanks, but it seems I had misread the OPs question. They want to aggregate starting from a time on day one and ending on the subsequent day--so basically shifting the start and end of `apply.daily`. Post now updated. – A5C1D2H2I1M1N2O1R2T1 Jul 19 '12 at 12:01
2

Step 1: transform date to POSIXct

ttt <- as.POSIXct("1970-01-05 08:00:00",tz="GMT")
ttt
#"1970-01-05 08:00:00 GMT"

Step 2: subtract a difftime of 7 hours

ttt <- ttt-as.difftime(7,units="hours")
ttt
#"1970-01-05 01:00:00 GMT"

Step 3: truncate to days

ttt<-trunc(ttt,"days")
ttt
#"1970-01-05 GMT"

Step 4: use plyr, data.table or whatever method you prefer, to calculate daily means
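
The four steps can be put together in base R roughly as follows. This is a minimal sketch, not the only way to do it: the data frame `hourly` and its values are invented for illustration, `aggregate` stands in for plyr or data.table, and each 07:00-to-07:00 window is labelled with the date on which it starts.

```r
# Hypothetical hourly series: 48 observations starting 1970-01-05 00:00 GMT
when <- seq(as.POSIXct("1970-01-05 00:00:00", tz = "GMT"),
            by = "1 hour", length.out = 48)
hourly <- data.frame(when = when, rain = rep(c(1, 2), each = 24))

# Steps 2 and 3: shift back 7 hours so that 07:00 becomes midnight,
# then truncate to the day
shifted <- hourly$when - as.difftime(7, units = "hours")
hourly$day <- as.Date(trunc(shifted, "days"))

# Step 4: aggregate per shifted day (sum for rainfall, mean for temperature)
daily <- aggregate(rain ~ day, data = hourly, FUN = sum)
print(daily)
```

With this convention the 07:00:00 observation belongs to the window that starts at 07:00, and the hours before the first 07:00 fall into a partial window dated one day earlier.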

Roland
2

You can use this code:

fun <- function(s,i,j) { sum(s[i:(i+j-1)]) }
sapply(X=seq(1,24*nb_of_days,24),FUN=fun,s=your_time_serie,j=24)

You just have to change the 1 in seq to another value to get a different time interval: 8 for 07:00:00 to 07:00:00, or 13 for 12:00:00 to 12:00:00 (assuming the first value in the series is at 00:00:00).
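
Here is a small worked example of the same idea with made-up input. It assumes the series is a plain numeric vector with no missing hours whose first value is at 00:00:00; the `starts` vector is an addition that keeps only windows with a full 24 values, so the last window does not run past the end of the data.

```r
# Hypothetical input: 3 days of hourly values, first value at 00:00:00
your_time_serie <- rep(1:3, each = 24)

fun <- function(s, i, j) sum(s[i:(i + j - 1)])

# Position 8 is the 07:00:00 value; stop where a full 24-hour window still fits
starts <- seq(8, length(your_time_serie) - 24 + 1, by = 24)
daily <- sapply(X = starts, FUN = fun, s = your_time_serie, j = 24)
print(daily)
```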

Pop
0

Using regular expressions should get you what you need. Select lines that match your needs and sum the values. Do this for each day within your hour range and you're set.

Dan