2

I have an xts object, and I wish to create weighted sums of the columns (and do this a LOT). By far the easiest way is matrix multiplication, but then the result loses the nice xts qualities.

It's easy to add them back by creating a new xts object -- but it's both slow and tedious.

For example:

dd <- xts(matrix(rnorm(200), ncol=2), Sys.Date() + 1:100)
w_sum <- dd %*% c(-1, 1)

... and the problem is:

> tail(w_sum)
             [,1]
 [95,]  0.1758262
 [96,] -0.3310975
 [97,] -0.1204836
 [98,] -1.2242001
 [99,] -1.7333222
[100,]  1.1216603

The fix is:

w_sumx <- xts(dd %*% c(-1, 1), index(dd))

But not only is it bothersome, it's slow. Also, i note with interest that xts is really fast for subtraction. Is there a way to do this which leverages the fast internals of xts?

f1 <- function() xts(dd %*% c(-1, 1), index(dd))
f2 <- function() dd[,2] - dd[,1]

> microbenchmark::microbenchmark(f1(), f2(), times = 1000)
Unit: microseconds
 expr  min   lq     mean median     uq    max neval cld
 f1() 83.7 97.3 114.1294 104.65 115.00 6688.4  1000   b
 f2() 26.3 34.0  40.6202  38.85  45.15  155.4  1000  a 
ricardo
  • 8,195
  • 7
  • 47
  • 69
  • It is feasible to keep `w_sum` as a column within `dd`? If so, `dd$w_sum <- dd %*% c(-1, 1)` might be fastest (though you need named columns of `dd` for this to work, which you've not got in your example but might in real life). – Andrew Gustar Feb 15 '20 at 10:49
  • It’s possible to set things up that way — but I’d prefer that it works without doing so. It occurred to me that i could overload %*% and make it fast with Rcpp ... though that seems like a sledgehammer – ricardo Feb 15 '20 at 11:36
  • `w_sum <- Reduce("+", as.list(dd * rep(c(-1, 1), each = nrow(dd))))` is a little faster than `f1` - for some reason it manages to preserve the `xts` class. – Andrew Gustar Feb 15 '20 at 12:16
  • 2
    This seems faster. `replace(dd[, 1], 1:nrow(dd), dd %*% c(-1, 1))` – G. Grothendieck Feb 15 '20 at 15:00

1 Answers1

2

A few simple alternatives exists. Obviously you could rewrite the method in Rcpp as suggested, but a simpler alternative is just to overwrite the attributes after performing matrix regular multiplication.

dd_new <- dd %*% c(-1, 1)
att <- attributes(dd)
att$dim <- dim(dd_new)
attributes(dd_new) <- att

This is not as fast as pure matrix multiplication, but is about 10 - 13x faster than subsetting the time series itself.

microbenchmark::microbenchmark(xts = dd[, 1] - dd[, 2], 
                               matmult = dd %*% c(1, -1),
                               xtsmatmult = xts(dd %*% c(1, -1), index(dd)),
                               "%.%" = dd %.% c(1, -1),
                               "%-%" = dd %-% c(1, -1),
                               times = 1e5)
Unit: milliseconds
       expr    min     lq    mean median     uq    max neval
        xts 0.0396 0.0685 0.11200 0.0998 0.1170  15.40 1e+05
    matmult 0.0008 0.0021 0.00352 0.0028 0.0040   7.71 1e+05
 xtsmatmult 0.0853 0.1380 0.22900 0.2100 0.2300 117.00 1e+05
        %.% 0.0025 0.0055 0.00905 0.0076 0.0099   8.97 1e+05
        %-% 0.0096 0.0183 0.03030 0.0268 0.0318 101.00 1e+05

In the above %.% is a barebone function leaving only doing the matrix multiplication and overwriting the attributes, while %-% adds some simple input-checks, to ensure that dimensions are acceptable, and using S3 class style, in order to simplify generalizations.

Functions:

note that the compiler::cmpfun function has been used to byte-compile the functions (similar to a package function). In this case the effect is insignificant.

`%.%` <- compiler::cmpfun(function(x, z){
    x2 <- x %*% z
    att <- attributes(x)
    att$dim <- dim(x2)
    attributes(x2) <- att
    x2
})
`%-%` <- function(x, z)
    UseMethod('%-%')
`%-%.xts` <- compiler::cmpfun(function(x, z){
    ## 
    if(!is.xts(x))
        stop('x must be an xts object')
    if(!is.numeric(z) || !(n <- length(z)) == ncol(x) || n == 0)
        stop('z must be an index vector')
    x2 <- x %*% z
    att <- attributes(x)
    att$dim <- dim(x2)
    attributes(x2) <- att
    x2
})
Community
  • 1
  • 1
Oliver
  • 8,169
  • 3
  • 15
  • 37
  • Must admit I don't understand the downvote for my answer, as it seems to take into account every piece of the question asked while keeping all the methods available to the result. – Oliver Feb 15 '20 at 22:15
  • I found that the code is broken. i get the error message `Error: evaluation nested too deeply: infinite recursion / options(expressions=)?` from both `%.%` and `%-%`. – ricardo Feb 17 '20 at 00:03
  • Interesting. There is no recursion going on with the two functions. The error message is usually caused by code running forever (this [answer](https://stackoverflow.com/a/16832933/10782538) explains it pretty nicely). My guess is that another part of your code is causing an almost-infinite loop (you said you had to do this "ALOT", the question is if this ALOT causes infinite recursion). – Oliver Feb 17 '20 at 06:20
  • oh i see; i wrote it up as `%*%.xts` ... is it possible that in my version it's calling itself in the `x2 <- x %*% z` step? – ricardo Feb 17 '20 at 07:15
  • That is very much possible. Try with a fresh R-session and comment out the `%*%.xts` function. (shortcut in R-studio for multi line comment should be ctrl+shift+C). hopefuly this should fix the problem. – Oliver Feb 17 '20 at 07:24
  • `rm(list = ls())` should create an almost 'fresh' R-session. – Oliver Feb 17 '20 at 07:25