0

I want to compute any type of "moving statistic" on a time series in R, beyond a moving average. For example, how would I compute a moving standard deviation over a time window of length 3?

I've tried the following:

#example data
x <- c(3,9,2,8,4,6,5,8)
#moving standard deviation over a time window of length 3
msd3 <- (cumsum(x^2)-cumsum(Lag(x^2,3)))/((1:length(x))-(Lag(1:length(x),3)))-((cumsum(x)-cumsum(Lag(x,3)))/((1:length(x))-(Lag(1:length(x),3))))^2

But not only does it not work (because the cumsum of the lagged vector gives a vector of all NAs), but I stopped trying to solve that last issue because it seems unnecessarily complicated. Any elegant solution to that problem?

cwarny
  • 997
  • 1
  • 12
  • 27
  • 2
    The last question you asked - http://stackoverflow.com/questions/14926572/moving-average-with-varying-time-window-in-r - (which you didn't mark an answer for I might add) pointed you to the `zoo` library, which is useful for these sorts of things. Please take a look and search before posting - you would have found questions like this one: http://stackoverflow.com/questions/13195442/moving-variance-in-r/13195632#13195632 – thelatemail Feb 17 '13 at 23:05
  • The answer provided by Gavin link pointed to by thelatemail answers your question precisely (and with a small amount of work, the answers to your earlier question today, also pointed by thelatemail): `rollapply(vec, width = 3, FUN = sd)` – Arun Feb 17 '13 at 23:31
  • 1
    Sorry about that. I did do some searching though. I guess I'm gonna delete my question. – cwarny Feb 18 '13 at 15:29

1 Answers1

3

I'd write a separate function to handle this, just to make it easy. For example:

lag_apply <- function(x, n, callback){
    k = length(x);
    result = rep(0, k);
    for(i in 1 : (k - n + 1)){
        result[i] <- callback(x[i :  (i + n -1)]);
    }    
    return(result);
}

> x 
   [1] 2 3 4 5 6 2 2 3 3 4 4
> lag_apply(x, 2, function(x){mean(x)})
   [1] 2.5 3.5 4.5 5.5 4.0 2.0 2.5 3.0 3.5 4.0
> lag_apply(x, 2, function(x){sd(x)})
   [1] 0.7071068 0.7071068 0.7071068 0.7071068 2.8284271 0.0000000 0.7071068 0.0000000 0.7071068 [10] 0.0000000

Now you can use this function with any 'lagging' callback that you can think of. It gets passed the parts of x as a vector of lenght n.

Gijs
  • 10,346
  • 5
  • 27
  • 38
  • Okay I'll leave this up, but actually the rollapply function, as mentioned in the comments to the OQ, is pretty much equivalent, except that you don't need to do any work. I feel stupid. – Gijs Feb 17 '13 at 23:29
  • I was about to write: or `rollapply(..)`. Nevertheless, it answers the OP's question. Just one suggestion: you may want to pre-allocate `result`. And maybe you should replace `mean` with `sd` as it would then directly answer the OP's queston. – Arun Feb 17 '13 at 23:32
  • How should I pre-allocate? – Gijs Feb 18 '13 at 00:00
  • you know the length of `result`. So, you should do: `result <- rep(0, .)` and then inside the for-loop, use index: `result[i]`. This is faster as you dont copy the whole object for every `i`. – Arun Feb 18 '13 at 00:09
  • Thanks for the improvement, I just added it. – Gijs Feb 19 '13 at 23:03