6

Most packages and posts I found apply mean to a fixed size window or the aggregate month/week data. Is it possible to calculate rolling k month average?

For example, for 1 month rolling window, assuming the data is:

Date          Value
2012-05-28    101
2012-05-25     99
2012-05-24    102
....
2012-04-30     78
2012-04-27     82
2012-04-26     77
2012-04-25     75
2012-04-24     76

The first three rolling 1 month windows should be:

1. 2012-05-28 to 2012-04-30
2. 2012-05-25 to 2012-04-26
3. 2012-05-24 to 2012-04-25

Please note that this is NOT the fixed width rolling window. The window actually changes on the daily basis.

BlueSun
  • 3,541
  • 1
  • 18
  • 37
Alpha
  • 2,372
  • 3
  • 21
  • 23
  • 1
    Take a look at `rollmean` from the `zoo` package. – Justin May 29 '12 at 16:24
  • Look here: http://stackoverflow.com/q/743812/602276 – Andrie May 29 '12 at 16:25
  • 1
    thanks for the quick reply. but those only apply to FIXED rolling windows. – Alpha May 29 '12 at 16:27
  • 2
    possible duplicate of [How to make a set containing count of data in rolling set of buckets](http://stackoverflow.com/questions/10741180/how-to-make-a-set-containing-count-of-data-in-rolling-set-of-buckets) – Andrie May 29 '12 at 16:29
  • 2
    how are you defining a month? look back 30 days? look back to the previous day of the month? I'm not sure I understand how the window width is changing... – Justin May 29 '12 at 16:37
  • Related: [Rolling list over unequal times in xts](http://stackoverflow.com/questions/10722512/rolling-list-over-unequal-times-in-xts), [Rolling window over irregular time series](http://stackoverflow.com/questions/10556450/rolling-window-over-irregular-time-series) – Joshua Ulrich May 29 '12 at 16:40
  • @Andrie this is close, but the endpoints looks weird though. it does not cut off the exact month-ends. – Alpha May 29 '12 at 16:47
  • @Justin one month relative to date: [yyyy-mm-dd] is [yyyy-(mm-1)-dd + 1], where (+1) is used to exclude the end points. – Alpha May 29 '12 at 16:50
  • @JoshuaUlrich i think this is possible, but looping into every element of the series would definitely be my last resort – Alpha May 29 '12 at 16:53
  • So one month ago from `2012-05-30` is `2012-04-31`? Awesome! –  May 29 '12 at 17:15
  • @JackManey should be 2012-05-01... date+1 can automatically do the month/year increment – Alpha May 29 '12 at 23:01
  • If `dd` is simply an integer, then there is nothing that inherently prevents `dd>31`. –  May 29 '12 at 23:12
  • @JackManey sorry, maybe i didn't clarify enough. what i mean by +1 to a date is actually as.Date('yyyy-mm-dd') + 1. anyway, i'm just trying to show the idea, not the exact expression or anything – Alpha May 30 '12 at 02:30

3 Answers3

1

I used this code to calculate monthly averages based on daily price data.

#function for extracting month is in the lubridate package
install.packages(c("plyr", "lubridate"))
require(plyr); require(lubridate)

#read the daily data
daily = read.csv("daily_lumber_prices.csv")
price = daily$Open
date = daily$Date

#convert date to a usable format
date = strptime(date, "%d-%b-%y")
mon = month(date)
T = length(price)

#need to know when months change
change_month = rep(0,T)

for(t in 2:T){
  if(mon[t] != mon[t-1]){
    change_month[t-1] = 1
  }
}

month_avg = rep(0,T)
total = 0
days = 0

for(t in 1:T){
  if(change_month[t] == 0){
    #cumulative sums for each variable
    total = total + price[t] 
    days = days + 1
  }

  else{
    #need to include the current month in the calculation
    month_avg[t] = (total + price[t]) / (days + 1)
    #reset the variables
    total = 0
    days = 0
  }
}

So, the variable month_avg is storing the monthly averages.

Is it something like this? This code accounts for the variable lengths of months. There's certainly a more efficient way to do it, but this works!

wcampbell
  • 791
  • 6
  • 8
0

Assuming your data frame is df this works for me:

df$past_avg = sapply(df$Date, function(i){
    i = as.POSIXct(i)
    mean(subset(df, Date > (i - months(1)) & Date < i)$Value)
})

Uses just base R. You can adjust to however many months in the past you want by changing the value in months().

CephBirk
  • 6,422
  • 5
  • 56
  • 74
0

runner package fully supports rolling windows operations on irregulary spaced time series. To calculate 1-month moving average on x object one have to specify idx = date (to make a runner time dependent) and k = "1 months" or k = 30 (days) depending what is more important to user. User can apply any R function - in this case we execute mean.

# example data
x <- cumsum(rnorm(20))
date <- Sys.Date() + cumsum(sample(1:5, 20, replace = TRUE)) # unequaly spaced time series

# calculate rolling average
runner::runner(
  x = x, 
  k = "1 months", 
  idx = date, 
  f = mean
)
GoGonzo
  • 2,637
  • 1
  • 18
  • 25