6

Say I have two columns in a data.frame/data.table, one holding the level and the other the volume. I want to compute a rolling average of the level weighted by volume, so that within each rolling window the volume acts as the weight (normalized to sum to 1).

Base R has a weighted.mean() function that does a similar calculation for two static vectors. I tried using sapply to pass a list/vector of arguments to it and create a rolling series, but to no avail.

Which "apply" mechanism should I use with weighted.mean() to get the desired result, or will I have to loop/write my own function?

////////////////////////////////////////////////////////////////////////////////////////

P.S. In the end I settled on writing a simple custom function that uses the great RcppRoll package. I found RcppRoll to be wicked fast, much faster than other rolling methods, which matters to me because my data has several million rows.

The code for the function looks like this (I've added some NAs at the beginning, since RcppRoll returns data without NAs):

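library(RcppRoll)
# Rolling weighted mean of vec1 with weights vec2 over a window of `width`;
# the leading NAs pad the result to the length of the inputs.
my.rollmean.weighted <- function(vec1, vec2, width) {
  c(rep(NA, width - 1), roll_sum(vec1 * vec2, width) / roll_sum(vec2, width))
}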
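As a sanity check on the identity this function relies on (rolling sum of level*volume divided by rolling sum of volume), here is a minimal base R sketch. The helper names `roll_sum_base` and `rolling_wmean` are hypothetical, the rolling sum is built from cumsum() rather than RcppRoll, and it assumes no NAs in the inputs and a width no larger than the vector length. The reference version calls weighted.mean() on each window through sapply over the window end positions.

```r
# Hypothetical helper: right-aligned rolling sum via cumulative sums (base R only).
roll_sum_base <- function(x, width) {
  cs <- c(0, cumsum(x))
  n <- length(x)
  cs[(width + 1):(n + 1)] - cs[1:(n - width + 1)]
}

# Rolling weighted mean, padded with leading NAs to the input length.
rolling_wmean <- function(level, volume, width) {
  c(rep(NA, width - 1),
    roll_sum_base(level * volume, width) / roll_sum_base(volume, width))
}

set.seed(1)
level  <- runif(20, 10, 20)
volume <- runif(20, 1, 5)
width  <- 5

fast <- rolling_wmean(level, volume, width)

# Reference: apply weighted.mean() to every window explicitly.
slow <- c(rep(NA, width - 1),
          sapply(width:length(level), function(i) {
            idx <- (i - width + 1):i
            weighted.mean(level[idx], volume[idx])
          }))

# Compare the two results up to floating-point tolerance.
print(all.equal(fast, slow))
```

The sapply version is the direct answer to the "which apply" question, but it re-reads every window, so on millions of rows the sum-ratio form is likely to be much faster.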
flipper
  • See if `?rollapply` suits your needs – Silence Dogood Jul 09 '14 at 06:56
  • rollapply is part of zoo; will it work on non-zoo objects? It seems that to use a version of rollapply I would have to convert everything to zoo and then back. Also, the question of how to roll a function that takes two vectors as input is still open. By default rollapply uses only one input, or am I wrong? – flipper Jul 09 '14 at 06:59
  • Have you looked at the package `RcppRoll`? – erc Jul 09 '14 at 07:59
  • No, thanks, I'll check it out. – flipper Jul 09 '14 at 08:17
  • You might be interested in: http://stackoverflow.com/questions/21368245/performance-of-rolling-window-functions-in-r – jangorecki Jul 09 '14 at 08:53
  • I might be missing something, but it looks to me like all these functions roll over one argument (X is the argument and everything else is a parameter), while I need a function of multiple (2) arguments that are related in the computation, not two columns passed in and computed column by column. Of course I could first create a column/vector of the product weight*level, apply a rolling sum to it and divide by the rolling sum of the weights; it is not rocket science. I was looking for a more elegant solution. RcppRoll looks great, thanks! – flipper Jul 09 '14 at 09:08
  • rollapply does **not** require that you convert the objects acted on to zoo. – G. Grothendieck Jul 09 '14 at 10:35
  • @flipper, It's **not** true that rollapply only works column by column. – G. Grothendieck Jul 09 '14 at 16:03
  • Again, thanks for the RcppRoll package mention. Blazing fast and convenient! – flipper Jul 10 '14 at 12:24
  • @flipper Which function of the `RcppRoll` package did you use to compute a weighted rolling average? – Julien Jun 16 '22 at 14:18
  • I have just found the `RcppRoll::roll_mean` function. Does it work with data frames like `rollapply`? – Julien Jun 16 '22 at 14:26

2 Answers

6

I think this might work. It employs the technique demonstrated in the rollapply documentation for rolling regression. The key is by.column = FALSE, which passes the function a matrix of all the columns for each rolling window.

  require(zoo)

  df <- data.frame(
    price = cumprod(1 + runif(1000,-0.03,0.03)) * 25,
    volume = runif(1000,1e6,2e6)
  )

  rollapply(
    df,
    width = 50,
    function(z){
      # z is a matrix holding the current window of all columns;
      # uncomment the next line if you want to see its structure
      # print(head(z, 3))
      weighted.mean(z[, "price"], z[, "volume"])
    },
    by.column = FALSE,
    align = "right"
  )

Let me know if it doesn't work or is not clear.
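If passing the whole data frame gives trouble, a commonly used alternative is to roll over the row indices and subset both columns inside the function. The sketch below is only an illustration under that assumption: `wmean_by_index`, `vwap`, and `vwap_partial` are made-up names, and the data are simulated as above. It also shows where `fill = NA` and `partial = TRUE` would go.

```r
library(zoo)

set.seed(42)
df <- data.frame(
  price  = cumprod(1 + runif(200, -0.03, 0.03)) * 25,
  volume = runif(200, 1e6, 2e6)
)
width <- 50

# FUN receives the row indices of the current window and uses them
# to subset both columns, so no conversion of df is needed.
wmean_by_index <- function(ix) weighted.mean(df$price[ix], df$volume[ix])

# fill = NA pads the first width - 1 positions so the result lines up with df.
df$vwap <- rollapplyr(seq_len(nrow(df)), width, wmean_by_index, fill = NA)

# partial = TRUE instead computes over the shorter windows at the start.
df$vwap_partial <- rollapplyr(seq_len(nrow(df)), width, wmean_by_index,
                              partial = TRUE)

head(df[, c("vwap", "vwap_partial")], 3)
```

Because the function is called once per window from R, this is convenient rather than fast; for several million rows a sum-ratio approach on precomputed products is likely the better fit.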

timelyportfolio
  • Thanks a lot. I get the idea, but it does not work for me. Looking at the debug output, for some reason it tries to subset the original data.table by columns in chunks of 10, and of course it misses the relevant columns in certain subsets. Again, maybe I am missing something :( I guess I will just have to use an additional column and be done with it. – flipper Jul 10 '14 at 11:45
  • How can I use `partial = TRUE` in this code? – Julien Feb 07 '23 at 16:41
-1

Here is a code snippet that might help. It uses the rollmean function from the zoo package with a window of two (pick whatever window you need). I assume you would first compute the variable with the weighted.mean function:

library(zoo) # for the rollmean() function

movavg <- rollmean(df$weightedVariable, k = 2, align = "right")
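Note that a plain rollmean of a single column is not volume-weighted by itself. One way to get a rolling weighted mean from rollmean, sketched here with simulated data and hypothetical names (`price`, `volume`, `wavg`), is to divide the rolling mean of price*volume by the rolling mean of volume; the 1/k factors cancel, leaving the weighted mean of each window.

```r
library(zoo)

set.seed(7)
price  <- cumprod(1 + runif(100, -0.03, 0.03)) * 25
volume <- runif(100, 1e6, 2e6)
k <- 10

# Ratio of two right-aligned rolling means; fill = NA keeps the input length.
wavg <- rollmean(price * volume, k, fill = NA, align = "right") /
  rollmean(volume, k, fill = NA, align = "right")

# Spot check against weighted.mean() on the first complete window.
print(all.equal(wavg[k], weighted.mean(price[1:k], volume[1:k])))
```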
lawyeR