Assuming that you have read the data from your .csv file into a data frame df, one approach to your problem is to use rollapply from the zoo package to compute a rolling sum:
library(zoo)
ind_keep <- seq(1, nrow(df) - 5, by = 5) ## 1. (every window [i, i+5] must fit inside the data)
out <- sapply(df[,-1], function(x) rollapply(x,6,sum)) ## 2.
out <- data.frame(df[ind_keep+5,1],out[ind_keep,]) ## 3.
colnames(out) <- c("Day_and_time","Rain1_mm/5min","Rain2_mm/5min") ## 4.
Notes:
1. Define the row indices, one every 5 minutes, at which we want to keep the rolling sum over the following 5 minutes.
2. Apply a rolling sum to each column.
   - Use sapply over all columns of df except the first. Note that the column indices specified in df[,-1] can be adjusted so that you process only certain columns.
   - The function to apply is rollapply from the zoo package. The additional arguments are the window width, 6 (the six readings in the closed interval [i, i+5]), and the sum function, so that this performs a rolling sum.
   At this point, out contains the rolling sum starting at each minute, but we only want those every 5 minutes. Therefore,
3. Combine the Day_and_Time column from the original df with out, keeping only the rows at every 5 minutes. Note that we keep the last Day_and_Time in each window.
4. This just renames the columns.
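To make step 2 concrete, here is a minimal base-R sketch of what a rolling sum of width 6 computes. The helper roll_sum and the vector x are illustrative names, not part of zoo or of the code above; on a plain numeric vector, rollapply(x, 6, sum) should return the same values.

```r
# Base-R stand-in for zoo::rollapply(x, width, sum) on a plain numeric vector.
# roll_sum is a hypothetical helper used only for illustration.
roll_sum <- function(x, width) {
  n_windows <- length(x) - width + 1
  vapply(seq_len(n_windows),
         function(i) sum(x[i:(i + width - 1)]),
         numeric(1))
}

x <- as.numeric(1:11)    # stands in for 11 one-minute readings
roll <- roll_sum(x, 6)   # one sum per window start: rows 1-6, 2-7, ..., 6-11
roll                     # 21 27 33 39 45 51

ind_keep <- c(1, 6)      # the two window starts kept for 11 rows
roll[ind_keep]           # 21 51
```

Only the first and sixth rolling sums survive, which is exactly what the subsetting in step 3 does.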
Using MikeyMike's data, which is
          Day_and_Time rain1 rain2
1  2010-02-12 01:00:00  0.03  0.00
2  2010-02-12 01:01:00  0.03  0.00
3  2010-02-12 01:02:00  0.01  0.00
4  2010-02-12 01:03:00  0.05  0.00
5  2010-02-12 01:04:00  0.03  0.10
6  2010-02-12 01:05:00  0.04  0.00
7  2010-02-12 01:06:00  0.02  0.10
8  2010-02-12 01:07:00  0.10  0.10
9  2010-02-12 01:08:00  0.30  0.00
10 2010-02-12 01:09:00  0.01  0.00
11 2010-02-12 01:10:00  0.00  0.01
this gives:
print(out)
##          Day_and_time Rain1_mm/5min Rain2_mm/5min
## 1 2010-02-12 01:05:00          0.19          0.10
## 2 2010-02-12 01:10:00          0.47          0.21
Note the difference in the result: this approach assumes you want overlapping windows, since you specified that you want to sum the six numbers in the closed interval [i, i+5] at each 5-minute mark.
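The overlap is easy to see on the rain1 column of the sample data. The sketch below compares the closed windows used here with non-overlapping blocks of five rows. The non-overlapping variant is only an assumed alternative shown for contrast, and the variable names are illustrative.

```r
# rain1 readings from the sample data (one per minute, 01:00 to 01:10)
rain1 <- c(0.03, 0.03, 0.01, 0.05, 0.03, 0.04, 0.02, 0.10, 0.30, 0.01, 0.00)

# Overlapping closed windows [i, i+5]: rows 1-6 and rows 6-11
overlapping <- c(sum(rain1[1:6]), sum(rain1[6:11]))

# Assumed alternative: non-overlapping blocks of 5 rows (rows 1-5, rows 6-10);
# row 11 falls into a third, incomplete block and is dropped here
block <- (seq_along(rain1) - 1) %/% 5
non_overlapping <- as.numeric(tapply(rain1, block, sum))[1:2]

overlapping       # approximately 0.19 0.47
non_overlapping   # approximately 0.15 0.47
```

The reading at 01:05 (0.04) is counted in both overlapping windows, which is why the first sum is 0.19 rather than 0.15.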
To extend the above to a window over the closed interval [i, i+nMin] at each nMin mark:
library(zoo)
nMin <- 10 ## for example 10 minutes
ind_keep <- seq(1, nrow(df) - nMin, by = nMin) ## every window [i, i+nMin] must fit inside the data
out <- sapply(df[,-1], function(x) rollapply(x, nMin+1, sum))
out <- data.frame(df[ind_keep+nMin, 1],out[ind_keep,])
colnames(out) <- c("Day_and_time",paste0("Rain1_mm/",nMin,"min"),paste0("Rain2_mm/",nMin,"min"))
For this to work, the data must have at least 2 * nMin + 1 rows.
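One likely reason for this minimum is how R simplifies results: with only nMin + 1 rows there is a single window per column, so sapply returns a plain vector instead of a matrix and the row subsetting in the next line no longer works. The sketch below illustrates this with a base-R stand-in for the rolling sum; roll_sum, small and big are hypothetical names, and the explanation is an assumption about where the limit comes from.

```r
nMin <- 10

# Hypothetical base-R stand-in for rollapply(x, width, sum) on a numeric vector
roll_sum <- function(x, width) {
  vapply(seq_len(length(x) - width + 1),
         function(i) sum(x[i:(i + width - 1)]),
         numeric(1))
}

# nMin + 1 rows: exactly one window per column, so sapply returns a vector
small <- data.frame(a = as.numeric(1:11), b = as.numeric(11:1))
out_small <- sapply(small, roll_sum, width = nMin + 1)
is.matrix(out_small)        # FALSE, so out_small[1, ] would raise an error

# 2 * nMin + 1 rows: two kept windows, and sapply returns an 11 x 2 matrix
big <- data.frame(a = as.numeric(1:21), b = as.numeric(21:1))
out_big <- sapply(big, roll_sum, width = nMin + 1)
dim(out_big)                # 11 2
out_big[c(1, nMin + 1), ]   # rows for the two kept windows
```

If you need to support shorter inputs, selecting rows with drop = FALSE keeps the matrix shape when only one row is kept.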
Hope this helps.