I have the following data:
library(data.table)

# 5352 IDs x 61 weeks = 326472 rows
y <- data.table(cbind(week = rep(1:61, 5352),
                      ID = rep(1:5352, each = 61),
                      w = runif(326472),
                      v = runif(326472)))
# Set 10000 random values of v to NA
y$v[sample(1:326472, 10000, replace = FALSE)] <- NA
for which I'm running the code below. For each ID it computes a rolling mean of variable v over a 12-row window, ignoring outliers and NAs. The code works, but performance is poor. I'm sure there is a more efficient way to do this using apply or something similar, but I haven't managed to write a faster version. Can anyone shed some light on how to make it more efficient?
library(zoo)  # provides rollapplyr

IDs <- unique(y$ID)
y$vol_m12 <- 0
for (i in seq_along(IDs)) {
  x <- y[ID == IDs[i]]
  outlier <- 0.2
  w_outlier <- quantile(x$w, outlier, na.rm = TRUE)
  v_outlier <- quantile(x$v, 1 - outlier, na.rm = TRUE)
  # Ignore outliers: high v combined with low w
  x$v_temp <- x$v
  x$v_temp[(x$v_temp >= v_outlier) & (x$w <= w_outlier)] <- NA
  # Rolling mean over a right-aligned window of 12 rows
  y$vol_m12[y$ID == IDs[i]] <- x[, rollapplyr(v_temp, 12, mean, fill = NA, na.rm = TRUE)]
}