
I need to calculate a moving average and standard deviation over a moving window. This is simple enough with the caTools package!

... However, having defined my moving window, I would like to take the average of ONLY those values within the window whose corresponding values of other variables meet certain criteria. For example, I would like to calculate a moving Temperature average using only the values within the window (e.g. +/- 2 days) for which, say, Relative Humidity is above 80%.

Could anybody help point me in the right direction? Here is some example data:

# Fill column-wise so the first nine values become Temp and the last nine RH
da <- data.frame(matrix(c(12,15,12,13,8,20,18,19,20,80,79,91,92,70,94,80,80,90), 
               ncol = 2, byrow = FALSE))

names(da) = c("Temp", "RH") 

Thanks,

Brad

user1959078
  • Welcome to SO! Please add a [minimal, reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and please show us [what you have tried](http://meta.stackoverflow.com/help/how-to-ask). Then you will be much more likely to receive a rapid, helpful answer. Cheers. – Henrik Sep 10 '13 at 07:53
  • Thanks Henrik! So here's an example of data, and say I want to make my moving window size 3 steps: da = data.frame(matrix(c(12,15,12,13,8,20,18,19,20,80,79,91,92,70,94,80,80,90), ncol = 2, byrow = FALSE)); names(da) = c("Temp", "RH") – user1959078 Sep 10 '13 at 08:19
  • Click 'edit' under your question and include this as part of the question, not in a comment. – Simon O'Hanlon Sep 10 '13 at 08:28
  • Thanks for the example! You may also wish to have a look [here](http://meta.stackexchange.com/questions/22186/how-do-i-format-my-code-blocks) on how to format code in a nice way in questions, answers and comments. – Henrik Sep 10 '13 at 08:28

1 Answer

I haven't used caTools, but the help text for what is presumably the most relevant function in that package, ?runmean, says that x, the input data, can be either "a numeric vector [...] or matrix with n rows". In your case the matrix alternative is the relevant one: you want the mean of a focal variable, Temp, conditional on a second variable, RH, so the function needs access to both variables. However, "[i]f x is a matrix than each column will be processed separately". Thus, I don't think caTools can solve your problem. Instead, I would suggest rollapply in the zoo package. rollapply has the argument by.column, which defaults to TRUE: "If TRUE, FUN is applied to each column separately". Because the function needs access to both columns at once, we set by.column to FALSE.

# rollapply comes from the zoo package
library(zoo)

# First, specify a function to apply to each window: mean of Temp where RH > 80
meanfun <- function(x) mean(x[(x[ , "RH"] > 80), "Temp"])

# Apply the function to windows of size 3 in your data 'da'.
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE)
meanTemp

# If you want to add the means to 'da',
# the result needs to have the same length as the number of rows in 'da'.
# This can be achieved with the `fill` argument,
# which pads the resulting vector of running means with NA
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE, fill = NA)

# Add the vector of means to the data frame
da2 <- cbind(da, meanTemp)
da2

# An even smaller example, to make it easier to see how the function works
da <- data.frame(Temp = 1:9, RH = rep(c(80, 81, 80), each = 3))
meanTemp <- rollapply(data = da, width = 3, FUN = meanfun, by.column = FALSE, fill = NA)
da2 <- cbind(da, meanTemp)
da2

#   Temp RH meanTemp
# 1    1 80       NA
# 2    2 80      NaN
# 3    3 80      4.0
# 4    4 81      4.5
# 5    5 81      5.0
# 6    6 81      5.5
# 7    7 80      6.0
# 8    8 80      NaN
# 9    9 80       NA
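
The question also asks for a standard deviation and a window of +/- 2 days. As a rough sketch of how that could look, here is a base R version (a plain loop over windows, not rollapply) that makes explicit what happens inside each window. It assumes one row per day, so +/- 2 days is a centred window of 5 rows; `cond_roll` and its arguments `half_width`, `rh_min` and `stat` are hypothetical names made up for this illustration.

```r
# Example data from the question (Temp first, then RH)
da <- data.frame(
  Temp = c(12, 15, 12, 13, 8, 20, 18, 19, 20),
  RH   = c(80, 79, 91, 92, 70, 94, 80, 80, 90)
)

# Hypothetical helper: apply 'stat' to Temp over a centred window of
# +/- half_width rows, using only the rows where RH exceeds rh_min.
# Assumes one row per time step (e.g. one row per day).
cond_roll <- function(df, half_width, rh_min, stat) {
  n <- nrow(df)
  vapply(seq_len(n), function(i) {
    # Incomplete windows at the edges are left as NA
    if (i - half_width < 1 || i + half_width > n) return(NA_real_)
    idx  <- (i - half_width):(i + half_width)
    keep <- df$RH[idx] > rh_min
    stat(df$Temp[idx][keep])
  }, numeric(1))
}

da$meanTemp <- cond_roll(da, half_width = 2, rh_min = 80, stat = mean)
da$sdTemp   <- cond_roll(da, half_width = 2, rh_min = 80, stat = sd)
da
```

Note that a window with no qualifying rows gives NaN for the mean, and a window with fewer than two qualifying rows gives NA for the standard deviation.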
Henrik