
I have a time series and would like to compute an average automatically for every hour. My data contain temperature and date_time (timestamps).
I do not want a moving average. I would like one average for each clock hour (1, 2, 3, 4, ... o'clock); the data are usually sampled every 2 minutes, as in this extract from a single day.

 temperature    date_time
1     -1.52 2007-09-29 00:00:08
2     -1.48 2007-09-29 00:02:08
3     -1.46 2007-09-29 00:04:08
4     -1.56 2007-09-29 00:06:08
5     -1.64 2007-09-29 00:08:08
6     -1.75 2007-09-29 00:10:08
7     -1.74 2007-09-29 00:12:08
8     -2.02 2007-09-29 00:14:08
9     -2.02 2007-09-29 00:16:08
10    -1.90 2007-09-29 00:18:08
11    -1.66 2007-09-29 00:20:08
12    -1.80 2007-09-29 00:22:08
13    -1.68 2007-09-29 00:24:08
14    -1.81 2007-09-29 00:26:08
15    -1.77 2007-09-29 00:28:08
16    -1.83 2007-09-29 00:30:08
17    -1.84 2007-09-29 00:32:08
18    -1.93 2007-09-29 00:34:08
19    -1.62 2007-09-29 00:36:08
20    -1.87 2007-09-29 00:38:08
21    -1.54 2007-09-29 00:40:08
22    -1.93 2007-09-29 00:42:08
23    -1.88 2007-09-29 00:44:08
24    -1.82 2007-09-29 00:46:08
25    -1.78 2007-09-29 00:48:08
26    -1.67 2007-09-29 00:50:08
27    -1.67 2007-09-29 00:52:08
28    -1.56 2007-09-29 00:54:08
29    -1.84 2007-09-29 00:56:08
30    -1.74 2007-09-29 00:58:08
31    -1.79 2007-09-29 01:00:08
32    -1.82 2007-09-29 01:02:08
33    -1.78 2007-09-29 01:04:08
34    -1.88 2007-09-29 01:06:08
35    -1.84 2007-09-29 01:08:08
36    -1.78 2007-09-29 01:10:08
37    -1.94 2007-09-29 01:12:08
38    -1.80 2007-09-29 01:14:08
39    -1.74 2007-09-29 01:16:08
40    -1.76 2007-09-29 01:18:08
41    -1.80 2007-09-29 01:20:08
42    -1.60 2007-09-29 01:22:08
43    -1.59 2007-09-29 01:24:08
44    -1.52 2007-09-29 01:26:08
45    -1.41 2007-09-29 01:28:08
46    -1.42 2007-09-29 01:30:08
47    -1.44 2007-09-29 01:32:08
48    -1.38 2007-09-29 01:34:08
49    -1.34 2007-09-29 01:36:08
50    -1.40 2007-09-29 01:38:08
51    -1.40 2007-09-29 01:40:08
52    -1.48 2007-09-29 01:42:08
53    -1.36 2007-09-29 01:44:08
54    -1.42 2007-09-29 01:46:08
55    -1.46 2007-09-29 01:48:08
56    -1.46 2007-09-29 01:50:08
57    -1.47 2007-09-29 01:52:08
58    -1.50 2007-09-29 01:54:08
59    -1.51 2007-09-29 01:56:08
60    -1.49 2007-09-29 01:58:08
61    -1.54 2007-09-29 02:00:08
62    -1.50 2007-09-29 02:02:08
63    -1.55 2007-09-29 02:04:08
64    -1.52 2007-09-29 02:06:08
65    -1.66 2007-09-29 02:08:08
66    -1.88 2007-09-29 02:10:08
67    -1.72 2007-09-29 02:12:08
68    -1.68 2007-09-29 02:14:08
69    -1.68 2007-09-29 02:16:08
70    -1.60 2007-09-29 02:18:08
71    -1.71 2007-09-29 02:20:08
72    -1.71 2007-09-29 02:22:08
73    -1.80 2007-09-29 02:24:08
74    -1.76 2007-09-29 02:26:08
75    -1.84 2007-09-29 02:28:08
76    -1.96 2007-09-29 02:30:08
77    -2.06 2007-09-29 02:32:08
78    -2.16 2007-09-29 02:34:08
79    -2.04 2007-09-29 02:36:08
80    -1.93 2007-09-29 02:38:08
81    -1.98 2007-09-29 02:40:08
82    -1.86 2007-09-29 02:42:08
83    -2.08 2007-09-29 02:44:08
84    -1.78 2007-09-29 02:46:08
85    -1.50 2007-09-29 02:48:08
86    -1.60 2007-09-29 02:50:08
87    -1.53 2007-09-29 02:52:08
88    -1.76 2007-09-29 02:54:08
89    -1.64 2007-09-29 02:56:08
90    -1.52 2007-09-29 02:58:08
91    -1.82 2007-09-29 03:00:08
agstudy
A Kntu
  • Looking at your data: 1. it doesn't show different dates; 2. do you want the average for each hour of each day? – bonCodigo Dec 17 '12 at 14:06
  • [What have you tried](http://mattgemmell.com/2008/12/08/what-have-you-tried/)? – MattLBeck Dec 17 '12 at 14:07
  • The data are sampled every 2 minutes. Do you think it is reasonable to add all the data, which cover about 5 months, to the question? I would like to have the average per hour for each day. – A Kntu Dec 17 '12 at 14:09

2 Answers

Assuming your dataset is called temp and that your "date_time" variable is in a proper date-time format (converted using, say, as.POSIXlt(temp$date_time)), you can simply use aggregate and cut to get hourly summaries:

aggregate(list(temperature = temp$temperature), 
          list(hourofday = cut(temp$date_time, "1 hour")), 
          mean)
#             hourofday temperature
# 1 2007-09-29 00:00:00   -1.744333
# 2 2007-09-29 01:00:00   -1.586000
# 3 2007-09-29 02:00:00   -1.751667
# 4 2007-09-29 03:00:00   -1.820000
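
For a self-contained check of the approach above, here is a minimal sketch. The data frame temp is rebuilt from only a few rows of the question's sample, and the timestamps are parsed with as.POSIXct using tz = "UTC", which is an assumption (the question does not state a time zone).

```r
# A few rows copied from the question's sample (not the full data set)
temp <- data.frame(
  temperature = c(-1.52, -1.48, -1.79, -1.82, -1.54, -1.50),
  date_time   = c("2007-09-29 00:00:08", "2007-09-29 00:02:08",
                  "2007-09-29 01:00:08", "2007-09-29 01:02:08",
                  "2007-09-29 02:00:08", "2007-09-29 02:02:08"),
  stringsAsFactors = FALSE
)

# Parse the character timestamps; tz = "UTC" is an assumption
temp$date_time <- as.POSIXct(temp$date_time, tz = "UTC")

# cut() assigns every timestamp to the clock hour it falls in,
# so each date/hour combination becomes its own group
hourly <- aggregate(list(temperature = temp$temperature),
                    list(hourofday = cut(temp$date_time, "1 hour")),
                    mean)
hourly
```

Because cut() works on the full timestamp rather than on the hour alone, the same call should keep 01:00 on one day separate from 01:00 on the next when the data span several days.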
A5C1D2H2I1M1N2O1R2T1
  • Would you please tell me how I can plot it with ggplot? – A Kntu Dec 17 '12 at 15:10
  • @user1885733, no, I won't. But you can search Stack Overflow for "`[r] ggplot hourly time series plot`" or something similar which will eventually get you to [this question](http://stackoverflow.com/q/13649019/1270695) among other useful questions and answers which would help you figure out how to do what you need to do. – A5C1D2H2I1M1N2O1R2T1 Dec 17 '12 at 16:35

Since you are working with time series, you can use the xts package (or zoo, or ts).

Here I assume your data look like this:

 head(dat)
     V2         V3       V4
2 -1.52 2007-09-29 00:00:08
3 -1.48 2007-09-29 00:02:08
4 -1.46 2007-09-29 00:04:08
5 -1.56 2007-09-29 00:06:08
6 -1.64 2007-09-29 00:08:08
7 -1.75 2007-09-29 00:10:08

First, I construct the xts object:

  library(xts)
  dat.xts <- xts(x = dat$V2, order.by = as.POSIXct(paste(dat$V3, dat$V4)))


 head(dat.xts)
                     [,1]
2007-09-29 00:00:08 -1.52
2007-09-29 00:02:08 -1.48
2007-09-29 00:04:08 -1.46
2007-09-29 00:06:08 -1.56
2007-09-29 00:08:08 -1.64
2007-09-29 00:10:08 -1.75

Then I use period.apply which, like the rest of the apply family, applies a given function (here mean) to blocks of the data. The blocks are the consecutive, non-overlapping periods delimited by the endpoints in ep:

ep <- endpoints(dat.xts,'hours')
period.apply(dat.xts,ep,mean)
                         [,1]
2007-09-29 00:58:08 -1.744333
2007-09-29 01:58:08 -1.586000
2007-09-29 02:58:08 -1.751667
2007-09-29 03:00:08 -1.820000
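
Note that period.apply labels each hour with the timestamp of its last observation (00:58:08, 01:58:08, ...). If labels at the start of the hour are preferred, as in the aggregate answer, one option is to truncate the index afterwards. This is a minimal sketch on a few rows of the sample data; it assumes the xts package is installed, and tz = "UTC" is an assumption, not something stated in the question.

```r
library(xts)

# A few rows copied from the question's sample; tz = "UTC" is an assumption
times <- as.POSIXct(c("2007-09-29 00:00:08", "2007-09-29 00:02:08",
                      "2007-09-29 01:00:08", "2007-09-29 01:02:08",
                      "2007-09-29 02:00:08"), tz = "UTC")
dat.xts <- xts(c(-1.52, -1.48, -1.79, -1.82, -1.54), order.by = times)

# endpoints() gives the row number of the last observation in each hour,
# preceded by 0: here 0 2 4 5
ep <- endpoints(dat.xts, "hours")

# Mean of each block of rows between consecutive endpoints
hourly <- period.apply(dat.xts, ep, mean)

# Relabel each block with the start of its hour instead of its last timestamp
index(hourly) <- as.POSIXct(trunc(index(hourly), "hours"))
hourly
```

Truncating after the aggregation leaves the means themselves unchanged; only the labels move from the last observation of each hour to the start of that hour.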

To compute the weekly mean, for example, you just change the period passed to endpoints:

ep <- endpoints(dat.xts,'weeks')
period.apply(dat.xts,ep,mean)

                    [,1]
2007-09-29 03:00:08 -1.695385

plot(dat.xts)


agstudy