7

I want to compute a moving average over a certain time window without generating NAs at the beginning of the time series. For instance, if I set the time window to 3, the 2 first observations will have NAs. What I want is to have a time window of 1 for the first observation, 2 for the second observation, and then 3 for all the remaining observations.

My current code:

#example data
x <- c(3,9,2,8,4,6,5,8)
#moving average with time window of length 3
(ma3 <- filter(x,rep(1/3,3),sides=1))
cwarny

6 Answers

5

I don't see a way other than brute-force:

Using rollapply from package zoo instead of filter:

library(zoo)
# first two values computed by hand, the rest with a full window of 3
c(x[1], mean(x[1:2]), rollapply(x, width=3, FUN=mean))
Matthew Lundberg
3

Let me jump on the rollapply train, too:

> rollapply(c(NA, NA, x), width=3, FUN=mean, na.rm=TRUE)
[1] 3.000000 6.000000 4.666667 6.333333 4.666667 6.000000 5.000000 6.333333

Prepending window - 1 = 2 NA values and using na.rm=TRUE extends the time series, but the padded values are ignored when the mean is calculated. A slightly more verbose but otherwise equivalent syntax:

> rollapply(c(NA, NA, x), width=3, FUN=function(v) mean(v, na.rm=TRUE))

Thanks to Matthew for pointing this out.
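
The padding idea above can be wrapped for an arbitrary window length. This is only a sketch: the name pad_rollmean is made up for illustration, and it assumes x itself contains no NAs (they would be silently dropped too).

```r
library(zoo)

# Hypothetical helper: pad with window - 1 NAs, then let na.rm = TRUE
# shrink the effective window at the start of the series.
pad_rollmean <- function(x, window) {
  padded <- c(rep(NA_real_, window - 1), x)
  rollapply(padded, width = window, FUN = mean, na.rm = TRUE)
}

x <- c(3, 9, 2, 8, 4, 6, 5, 8)
res <- pad_rollmean(x, 3)
```

For window = 3 this should reproduce the output shown above, with one value per element of x.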

krlmlr
2

@thelatemail has done a great job, but there was an error in the code (test[] should be replaced with x[] inside the function) and, more importantly, the same thing has to be done at the end of the vector if you want sides=2. Also, for a centred average the window at the i-th element from the start should have size 2*(i-1)+1, and likewise for the i-th element from the end. So, here is the final version:

# Centred moving average whose window shrinks symmetrically at both ends.
# Assumes an odd window >= 3.
movavg.grow = function(x,window) {
  half = floor(window/2)
  n = length(x)
  # windows of size 1, 3, 5, ... at the start
  startma = sapply(1:half,function(y) mean(x[1:((y-1)*2+1)]))
  # mirrored windows at the end
  endma = sapply(1:half,function(y) mean(x[(n-((y-1)*2)):n]))
  endma = rev(endma)
  c(startma,
    stats::filter(x,rep(1/window,window))[(half+1):(n-half)],
    endma)
}

As a test, the function must return 1:10 for x=1:10, because the mean of a symmetric window over an evenly spaced sequence equals its centre value:

> x=1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> movavg.grow(x,5)
 [1]  1  2  3  4  5  6  7  8  9 10
> movavg.grow(x,3)
 [1]  1  2  3  4  5  6  7  8  9 10
2

The feature you are asking for is called a 'partial' window and, AFAIK, it is already available in the zoo package.
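
For reference, a minimal sketch of the zoo route, assuming the right-aligned window from the question (the equivalent of sides=1):

```r
library(zoo)

x <- c(3, 9, 2, 8, 4, 6, 5, 8)

# rollapplyr() is the right-aligned variant of rollapply();
# partial = TRUE lets the window shrink where fewer than 3 values exist.
res <- rollapplyr(x, width = 3, FUN = mean, partial = TRUE)
```

The first two elements of res are then mean(x[1]) and mean(x[1:2]), and the rest are ordinary 3-point means.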

There is also a new fast rolling mean function in data.table, to be released in 1.12.0.
Unfortunately, it does not support partial windows, but you can achieve the desired behavior using the 'adaptive' feature of that function in the following way:

x = c(3,9,2,8,4,6,5,8)
window = 3

library(data.table)
n = c(seq.int(window), rep(window, length(x)-window))
frollmean(x, n, adaptive=TRUE)
#[1] 3.000000 6.000000 4.666667 6.333333 4.666667 6.000000 5.000000 6.333333

You can find the manual entry for the new function online at ?froll.

jangorecki
1

Add zeros to the beginning and end of your sequence, as many as the size of the moving average window. This will prevent NAs.
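
A small base-R sketch of what that looks like for the one-sided case from the question (so only the start is padded, which is an assumption on my part). Note that the padded zeros are counted in the mean, so the first values differ from a shrinking window:

```r
x <- c(3, 9, 2, 8, 4, 6, 5, 8)
window <- 3

# pad window - 1 zeros at the start, filter, then drop the padded positions
padded <- c(rep(0, window - 1), x)
zero_padded <- as.numeric(
  stats::filter(padded, rep(1/window, window), sides = 1)
)[-seq_len(window - 1)]

# shrinking-window means for comparison: mean of the last (up to) 3 values
shrinking <- sapply(seq_along(x), function(i) mean(x[max(1, i - window + 1):i]))
```

Here zero_padded starts with 1 and 4, while shrinking starts with 3 and 6; from the third element on the two agree.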

Daniel
1

A custom function in base R to get you there:

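# Right-aligned moving average with a growing window at the start.
# Intended for sides=1; assumes window >= 2.
movavg.grow <- function(x,window,sides) {
 # means over the first 1, 2, ..., window-1 observations
 startma <- sapply(1:(window-1),function(y) mean(x[1:y]))
 c(startma,stats::filter(x,rep(1/window,window),sides=sides)[window:length(x)])
}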

Test it:

> test <- c(3,9,2,8,4,6,5,8)
> movavg.grow(x=test,window=3,sides=1)
[1] 3.000000 6.000000 4.666667 6.333333 4.666667 6.000000 5.000000 6.333333
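
As a cross-check, the same numbers can be obtained without filter at all, using cumulative sums. This is just a sketch of a different technique, not part of the function above; the name cumsum_ma is made up, and it assumes length(x) > window and no NAs in x.

```r
# Hypothetical helper: windowed sums via differences of cumulative sums,
# divided by the number of observations actually in each window.
cumsum_ma <- function(x, window) {
  cs <- cumsum(x)
  lagged <- c(rep(0, window), head(cs, -window))  # cs shifted by `window`
  (cs - lagged) / pmin(seq_along(x), window)
}

test <- c(3, 9, 2, 8, 4, 6, 5, 8)
res <- cumsum_ma(test, 3)
```

For the test vector this should agree with the movavg.grow output above up to floating-point rounding.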
thelatemail