4

I have been constructing a function for centered moving average in R (without using any packages), and have encountered a challenge as below:

As you know, the centered moving average includes the concept of incorporating the 'incomplete portions' (i.e. at the beginning and the end of the datapoint). For example, consider below vector p:

p <- c(10,20,30,40,50,60,70,80,90)

In this case, centered moving average that I am interested in looks like this:

x <- ((10+20)/2, (10+20+30)/3, (20+30+40)/3 ..... (70+80+90)/3, (80+90)/2)

To achieve above, I tried function with if function as below:

wd means window size

mov_avg <- function(p, wd) {
  x <- c(0, cumsum(p))
  if ((p > p[1])&(p < p[length(p)])) {
    neut <- 1:(length(p)-(wd-1))
    upper <- neut+(wd-1)
    x <- (x[upper]-x[neut])/(upper-neut)
  } else if (p==p[1]) {
    neut <- 0
    upper <- neut+3
    x <- (x[upper]-x[neut])/(upper-1-neut)
  } else if (p==p[length(p)]) {
    upper <-(length(p)+1)
    neut <- (length(p)-(wd-2))
    x <- (x[upper]-x[neut])/(upper-neut)
  }
  return(x)
}

Then I entered below line to execute:

mov_avg(p, 3)

I encountered errors as below:

numeric(0)
Warning messages:
1: In if ((p > p[1]) & (p < p[length(p)])) { :
  the condition has length > 1 and only the first element will be used
2: In if (p == p[1]) { :
  the condition has length > 1 and only the first element will be used

Could someone help me out in making this a working function?

Thank you!

CKE
  • 1,533
  • 19
  • 18
  • 29
Dugong
  • 67
  • 1
  • 9
  • Your warnings come from the fact that you compare in your `if()`-clause the whole vector `p` with a single value. So you need some adjustment there. Actually, what do you want to test there or in other words, what are the three cases you are trying to catch there? Not sure and tastes are different, but I think also that the cumsum might not be the easiest way to achieve your goal. – Daniel Fischer Jul 23 '18 at 05:12
  • Thank you very much for your comment Daniel. I am trying to make a probabilistic trajectory of event on a time series, and underway of developing several different moving average models, including this. Others will be TMA (triangular moving average) and EMA (exponential moving average).. – Dugong Jul 23 '18 at 05:20
  • Sorry, that was maybe a misunderstanding, I meant what are your three if-cases? To me it seems a bit like, the first should fetch the case, when you are 'in the middle' and the other two should fetch the cases when you are at the borders?! But this would not work, as you just go once through the whole thing in your function and you just create a single `x`-vector. So if you want to work still with your function (instead of the already suggested solutions), I think you would need still a loop around you if-clauses and then you would need to combine the output rather than just overwrite it. – Daniel Fischer Jul 23 '18 at 05:25
  • Yes Daniel, that's what I wanted to do.. you got it. Also, I understand that I need to create a loop. I will try both to see. :) Thanks again! – Dugong Jul 23 '18 at 05:28
  • 1
    Maybe just for the sake of completion, you could also check out this old thread https://stackoverflow.com/questions/743812/calculating-moving-average – Daniel Fischer Jul 23 '18 at 06:02

4 Answers4

3

We could also use rowMeans

rowMeans(embed(c(NA, p, NA),  3)[, 3:1], na.rm = TRUE)
#[1] 15 20 30 40 50 60 70 80 85
akrun
  • 874,273
  • 37
  • 540
  • 662
2

How about something like this in base R:

window <- 3
p <- c(10,20,30,40,50,60,70,80,90)

x <- c(NA, p, NA)
sapply(seq_along(x[-(1:(window - 1))]), function(i)
    mean(x[seq(i, i + window - 1)], na.rm = T))
#[1] 15 20 30 40 50 60 70 80 85

The trick is to add flanking NAs and then use mean with na.rm = T.


I know you said "without using packages", but the same is even shorter using zoo::rollapply

library(zoo)
rollapply(c(NA, p, NA), 3, mean, na.rm = T)
#[1] 15 20 30 40 50 60 70 80 85
Maurits Evers
  • 49,617
  • 4
  • 47
  • 68
  • 1
    There is also `rollmean` function which might be useful. – Ronak Shah Jul 23 '18 at 05:11
  • Thanks so much Maurits! That's an elegant way to solve the problem I guess! Also, appreciate introducing the package, Ronak and Maurits. I will try to use that as well! – Dugong Jul 23 '18 at 05:26
1

Another method is to create a function where we can adjust with variable windows

mov_avg <- function(p, window) {
 mean_number = numeric()
 index = 1
 while(index < length(p)) {
   if (index == 1 | index == length(p) - 1) 
    mean_number = c(mean_number, mean(p[index:(index + window - 2)]))
   else 
    mean_number = c(mean_number, mean(p[index:(index + window - 1)]))
  index = index + 1
  }
  mean_number
}

mov_avg(p, 3)
#[1] 15 30 40 50 60 70 80 85

mov_avg(p, 2)
#[1] 10 25 35 45 55 65 75 80
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • 1
    This is what I guess the closest I thought to create. Thank you! I would love to learn to use the loop elegantly. :) – Dugong Jul 23 '18 at 05:31
1

Take the mean by rows in a matrix with columns that are x, and the head and tail appended with the means respectively of the first two and last two elements.

apply( matrix( c(x, 
               c( x[1]+x[2])/2, head(x,-1) ),
               c( tail(x,-1), sum( tail(x,2))/2)  ),
               ncol = 3),
       1, mean)
IRTFM
  • 258,963
  • 21
  • 364
  • 487