This is the data in my text file (10 rows shown out of 10,000). Index holds the row names, temp holds the timestamps, and m holds the values in mm.

    "Index" "temp" "m"
    1 "2012-02-07 18:15:13" "4297"
    2 "2012-02-07 18:30:04" "4296"
    3 "2012-02-07 18:45:10" "4297"
    4 "2012-02-07 19:00:01" "4297"
    5 "2012-02-07 19:15:07" "4298"
    6 "2012-02-07 19:30:13" "4299"
    7 "2012-02-07 19:45:04" "4299"
    8 "2012-02-07 20:00:10" "4299"
    9 "2012-02-07 20:15:01" "4300"
    10 "2012-02-07 20:30:07" "4301"

I import it into R using this:

    x2=read.table("data.txt", header=TRUE)

I tried the following code to aggregate the time series to daily data:

    c=aggregate(ts(x2[, 2], freq = 96), 1, mean)

I set the frequency to 96 because, with 15-minute data, 96 values cover 24 hours.

It returns this:

    Time Series:
    Start = 1 
    End = 5 
    Frequency = 1 
    [1] 5366.698 5325.115 5311.969 5288.542 5331.115

But I want the result in the same format as my original data, i.e. with the timestamps next to the values. I need help achieving that.

rockswap
  • Please make a reproducible example: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Andrie Jun 25 '12 at 05:59
  • I have edited my question. Let me know if you need any more information. – rockswap Jun 25 '12 at 06:17
  • Maybe you should read the data into a data frame first, then dput the contents. – John Jun 25 '12 at 06:28
  • @John - I'm sorry, but dput produces a very long output and I don't think it's useful. You can just copy the content of my text file and use it. – rockswap Jun 25 '12 at 06:39
  • @andrie - You can just copy and paste the data I have provided into a text file and use that text file. I tried using dput(), but it produces a very long output. Please help me out here. – rockswap Jun 25 '12 at 07:05
  • Subset your original data.frame to 10 lines and dput that, something along the lines of `dput(my.data[1:10, ])`. – Roman Luštrik Jun 25 '12 at 08:42
  • @roman - I did use the command dput(x), where x is a data frame containing 10 lines of the original content, but it produced a very large output which I was unable to paste in this space. Hence, I pasted the 10 lines of content, which you can copy to a text file to get the data the way I have it. – rockswap Jun 25 '12 at 17:34

2 Answers

Use `apply.daily` from the xts package after converting your data to an xts object. Something like this should work:

    # Read the sample rows from the question
    x2 = read.table(header=TRUE, text='     "Index" "temp" "m"
    1 "2012-02-07 18:15:13" "4297"
    2 "2012-02-07 18:30:04" "4296"
    3 "2012-02-07 18:45:10" "4297"
    4 "2012-02-07 19:00:01" "4297"
    5 "2012-02-07 19:15:07" "4298"
    6 "2012-02-07 19:30:13" "4299"
    7 "2012-02-07 19:45:04" "4299"
    8 "2012-02-07 20:00:10" "4299"
    9 "2012-02-07 20:15:01" "4300"
    10 "2012-02-07 20:30:07" "4301"')

    # Parse the timestamps, build an xts object, and take the daily mean
    x2$temp = as.POSIXct(strptime(x2$temp, "%Y-%m-%d %H:%M:%S"))
    library(xts)
    x2 = xts(x = x2$m, order.by = x2$temp)
    apply.daily(x2, mean)
    ##                       [,1]
    ## 2012-02-07 20:30:07 4298.3

Update: Your problem in a reproducible format (with fake data)

We don't always need the actual dataset to be able to help troubleshoot....

    set.seed(1) # So you can get the same numbers as I do
    # Four days of 15-minute readings (4 * 96 = 384 rows)
    x = data.frame(datetime = seq(ISOdatetime(1970, 1, 1, 0, 0, 0), 
                                  length.out = 384, by = 900), 
                   m = sample(2000:4000, 384, replace = TRUE))
    head(x)
    #              datetime    m
    # 1 1970-01-01 00:00:00 2531
    # 2 1970-01-01 00:15:00 2744
    # 3 1970-01-01 00:30:00 3146
    # 4 1970-01-01 00:45:00 3817
    # 5 1970-01-01 01:00:00 2403
    # 6 1970-01-01 01:15:00 3797
    library(xts)
    x2 = xts(x$m, x$datetime)
    head(x2)
    #                     [,1]
    # 1970-01-01 00:00:00 2531
    # 1970-01-01 00:15:00 2744
    # 1970-01-01 00:30:00 3146
    # 1970-01-01 00:45:00 3817
    # 1970-01-01 01:00:00 2403
    # 1970-01-01 01:15:00 3797
    apply.daily(x2, mean)
    #                         [,1]
    # 1970-01-01 23:45:00 3031.302
    # 1970-01-02 23:45:00 3043.250
    # 1970-01-03 23:45:00 2896.771
    # 1970-01-04 23:45:00 2996.479
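
If a plain data frame is wanted instead of an xts object, the daily result can be converted back. The following is only a minimal sketch: it assumes a made-up 15-minute series in UTC with deterministic values, and `x`, `daily` and `daily_df` are illustrative names, not anything from the question.

```r
library(xts)  # also attaches zoo, which provides index() and coredata()

# Hypothetical data: two full days of 15-minute readings with values 1..192
x <- data.frame(
  datetime = seq(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"),
                 by = 900, length.out = 192),
  m = seq_len(192)
)

# Daily means as an xts object (one row per calendar day)
daily <- apply.daily(xts(x$m, order.by = x$datetime), mean)

# Back to a data frame: one date column, one value column
daily_df <- data.frame(
  day = as.Date(index(daily), tz = "UTC"),
  m   = as.numeric(coredata(daily))
)
daily_df
```

Here `index()` pulls out the timestamps and `coredata()` the values, so the dates end up next to the means in ordinary columns.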

Update 2: A workaround alternative

(Using the fake data I've provided in the above update.)

    data.frame(time = x[seq(96, nrow(x), by=96), 1],
               mean = aggregate(ts(x[, 2], freq = 96), 1, mean))
    #               time     mean
    # 1 1970-01-01 23:45 3031.302
    # 2 1970-01-02 23:45 3043.250
    # 3 1970-01-03 23:45 2896.771
    # 4 1970-01-04 23:45 2996.479
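
One caveat with this workaround: blocks of 96 values only line up with calendar days when the series starts at midnight and has no gaps. The question's data starts at 18:15, so the two groupings would differ. A small sketch of the difference, using a hypothetical series in UTC with deterministic values (`block`, `by_block` and `by_day` are illustrative names):

```r
# Hypothetical series: 192 readings, 15 minutes apart, starting at 18:15 UTC
start <- as.POSIXct("2012-02-07 18:15:00", tz = "UTC")
x <- data.frame(datetime = seq(start, by = 900, length.out = 192))
x$m <- seq_len(nrow(x))

# Grouping 1: fixed blocks of 96 consecutive readings
x$block <- (seq_len(nrow(x)) - 1) %/% 96
by_block <- aggregate(m ~ block, data = x, FUN = mean)

# Grouping 2: actual calendar days
x$day <- as.Date(x$datetime, tz = "UTC")
by_day <- aggregate(m ~ day, data = x, FUN = mean)

by_block  # 2 rows, each block straddles two calendar days
by_day    # 3 rows: a partial day, a full day, a partial day
```

Grouping on the date itself is the safer choice when the first or last day is incomplete or readings are missing.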
A5C1D2H2I1M1N2O1R2T1

This would be a way to do it in base R:

    x2 <- within(x2, {
       temp <- as.POSIXct(temp, format='%Y-%m-%d %H:%M:%S')
       days <- as.POSIXct(cut(temp, breaks='days'))
       m <- as.numeric(m)
    })

    with(x2, aggregate(m, by=list(days=days), mean))
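
Note that `aggregate(m, by = ...)` labels the value column `x`. As a variation, here is a sketch of the same idea with the formula interface, which keeps the name `m`, run on the ten sample rows from the question. Fixing the time zone to UTC and the names `day` and `daily` are assumptions made for the example.

```r
# The ten sample rows from the question
x2 <- read.table(header = TRUE, text = '"Index" "temp" "m"
1 "2012-02-07 18:15:13" "4297"
2 "2012-02-07 18:30:04" "4296"
3 "2012-02-07 18:45:10" "4297"
4 "2012-02-07 19:00:01" "4297"
5 "2012-02-07 19:15:07" "4298"
6 "2012-02-07 19:30:13" "4299"
7 "2012-02-07 19:45:04" "4299"
8 "2012-02-07 20:00:10" "4299"
9 "2012-02-07 20:15:01" "4300"
10 "2012-02-07 20:30:07" "4301"')

# Parse timestamps (UTC assumed) and derive the calendar day
x2$temp <- as.POSIXct(x2$temp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
x2$day  <- as.Date(x2$temp, tz = "UTC")

# One row per day, with the mean of m
daily <- aggregate(m ~ day, data = x2, FUN = mean)
names(daily) <- c("temp", "m")  # same column names as the original file
daily
```

All ten sample rows fall on 2012-02-07, so the result is a single row with the date and the mean of `m`.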
Matthew Plourde