My data is 988,785 obs. of 3 variables. A smaller example of my data is below:
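Names <- c("Jack", "Jill", "John")
RawAccelData <- data.frame(
  # Samples 1..60000 repeated once per person, so each ID gets a full series
  Sample = as.numeric(rep(1:60000, times = 3)),
  # One random value per row (3 x 60000 = 180000 rows)
  Acceleration = rnorm(180000),
  ID = rep(Names, each = 60000)
)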
The sample rate of my equipment is 100 Hz. I wish to calculate a rolling average of Acceleration for each ID over a 1 to 10 second period. I perform this using the following:
library(dplyr)
library(zoo)
for (summaryFunction in c("mean")) {
  # Window widths in samples: 100 samples = 1 second at 100 Hz
  for (i in seq(100, 1000, by = 100)) {
    tempColumn <- RawAccelData %>%
      group_by(ID) %>%
      transmute(rollapply(Acceleration,
                          width = i,
                          FUN = summaryFunction,
                          align = "right",
                          fill = NA,
                          na.rm = TRUE))
    # Column 1 is ID (kept by group_by); column 2 is the rolling result
    colnames(tempColumn)[2] <- paste("Rolling", summaryFunction, as.character(i), sep = ".")
    RawAccelData <- bind_cols(RawAccelData, tempColumn[2])
  }
}
However, I now need to calculate a rolling average over a 1 to 10 minute period. I can do this by using the above code and substituting in the following line:
for ( i in seq(6000, 60000, by = 6000)) {
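To make the window widths explicit: each width is the window duration in seconds multiplied by the 100 Hz sample rate. The variable names below are illustrative only and do not appear in my actual code.

```r
# Sample rate of the equipment, in samples per second
sample_rate_hz <- 100

# 1 to 10 second windows: 100, 200, ..., 1000 samples
widths_seconds <- (1:10) * sample_rate_hz

# 1 to 10 minute windows: 6000, 12000, ..., 60000 samples
widths_minutes <- (1:10) * 60 * sample_rate_hz

print(widths_seconds)
print(widths_minutes)
```

These match `seq(100, 1000, by = 100)` and `seq(6000, 60000, by = 6000)` respectively.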
However, this takes hours to run on my dataset and causes RStudio on my Mac (details below) to hang. Is there a way I can a) tidy up the above code or b) use a different package or method to get a quicker result?
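One possible direction, sketched here under assumptions rather than as a tested solution: a right-aligned rolling mean can be computed from cumulative sums, so the cost per window width no longer grows with the width. The helper `roll_mean_cumsum` is hypothetical (it is not part of zoo or dplyr), it assumes the series contains no `NA` values, and long cumulative sums can accumulate small floating-point error.

```r
# Hypothetical helper: right-aligned rolling mean via cumulative sums.
# Assumes x has no NA values; the first (width - 1) results are NA,
# mirroring align = "right", fill = NA in rollapply().
roll_mean_cumsum <- function(x, width) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (width > n) return(out)
  cs <- c(0, cumsum(x))          # cs[k + 1] holds sum(x[1:k])
  idx <- width:n                 # positions where a full window exists
  out[idx] <- (cs[idx + 1] - cs[idx + 1 - width]) / width
  out
}

# Small illustrative data set (made-up values, not my real data)
demo <- data.frame(
  Acceleration = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50),
  ID = rep(c("Jack", "Jill"), each = 5)
)

# ave() applies the helper within each ID and keeps the original row order
demo$Rolling.mean.2 <- ave(demo$Acceleration, demo$ID,
                           FUN = function(v) roll_mean_cumsum(v, 2))
print(demo)
```

On the full data the same `ave()` call could be repeated for each width in `seq(6000, 60000, by = 6000)`; whether this is actually fast enough on 988,785 rows is something I have not measured.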
Thank you.
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zoo_1.7-12 dplyr_0.4.3
loaded via a namespace (and not attached):
[1] lazyeval_0.1.10 magrittr_1.5 R6_2.1.1 assertthat_0.1 parallel_3.2.3 DBI_0.3.1
[7] tools_3.2.3 Rcpp_0.12.2 grid_3.2.3 lattice_0.20-33