
Suppose I need to apply an MA(5) to a batch of market data stored in an xts object. I can easily pull the subset of data I want smoothed with xts subsetting:

x['2013-12-05 17:00:01/2013-12-06 17:00:00']

However, I need an additional 5 observations prior to the first one in my subset to "prime" the filter. Is there an easy way to do this?

The only thing I have been able to figure out is really ugly, with explicit row numbers (here using xts sample data):

require(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)

x$rn <- row(x[,1])
frst <- first(x['2007-05-18'])$rn
finl <- last(x['2007-06-09'])$rn
ans <- x[(frst-5):finl,]

Can I just say bleah? Somebody help me.

UPDATE: by popular request, a short example that applies an MA(5) to the daily data in sample_matrix:

require(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)$Close

calc_weights <- function(x) {
    ##replace rnorm with sophisticated analysis
    wgts <- matrix(rnorm(5,0,0.5), nrow=1)
    xts(wgts, index(last(x)))
}

smooth_days <- function(x, wgts) {
    w <- wgts[index(last(x))]
    out <- filter(x, w, sides=1)
    xts(out, index(x))
}

set.seed(1.23456789)
wgts <- apply.weekly(x, calc_weights)
lapply(split(x, f='weeks'), smooth_days, wgts)

For brevity, only the final week's output:

[[26]]
                [,1]
2007-06-25        NA
2007-06-26        NA
2007-06-27        NA
2007-06-28        NA
2007-06-29 -9.581503
2007-06-30 -9.581208

The NAs here are my problem. I want to recalculate my weights for each week of data, and apply those new weights to the upcoming week. Rinse, repeat. In real life, I replace the lapply with some ugly stuff with row indexes, but I'm sure there's a better way.

To state the problem clearly: I want to run the analysis on non-overlapping time periods (weeks, in this case), but the calculation for each period requires data from overlapping periods (2 weeks, in this case).

khoxsey
  • Your use-case isn't clear. Can you provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of what you're actually trying to do? You could use the `which.i` arg to `[.xts`: `ans <- x[(x['2007-05-18',which.i=TRUE]-15):x['2007-06-09',which.i=TRUE]]`, but there's probably a better way. – Joshua Ulrich Dec 08 '13 at 17:29
  • `which.i` is exactly what I was wondering about. I'm sure you're right that there is a better way, but at least this provides me traction. If you make that comment an answer I'll accept it. – khoxsey Dec 08 '13 at 18:38
  • Even with Joshua's comment I'm still not sure. Do you mean your data goes from say, Jan 1 2013 and your problem is your MA(15) for Jan 1st to Jan 14th is not being made? If you actually have the Dec 15 to Dec 31 data, just prepend it to `x`, do the MA calcs, then remove it after. And if you don't have that data, tough. The other interpretation is more unusual: you want an extra set of 15 observations prepended to each MA(15) batch? (If trying to do something between an SMA and an EMA I can almost see why you'd want to do that...) – Darren Cook Dec 09 '13 at 00:10
  • Note to all: changed the discussion examples from MA(15) to MA(5), and edited accordingly. It makes the example weights matrix smaller. – khoxsey Dec 09 '13 at 01:30

1 Answer

Here's one way to do this using endpoints and a for loop. You could still use the which.i=TRUE suggestion in my comment, but integer subsetting is faster.

y <- x*NA                   # pre-allocate result
ep <- endpoints(x,"weeks")  # time points where parameters change

set.seed(1.23456789)
for(i in seq_along(ep)[-(1:2)]) {
  rng1 <- (ep[i-1]+1):ep[i]      # obs to calc weights (ep holds each week's last row, starting at 0)
  rng2 <- (ep[i-2]+1):ep[i]      # "prime" obs: previous week plus current week
  wgts <- calc_weights(x[rng1])
  # calc smooth_days on rng2, but only keep rng1 results
  y[rng1] <- smooth_days(x[rng2], wgts)[index(x[rng1])]
}
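
For comparison, here is a minimal sketch of the `which.i=TRUE` route for a single date window. It is only an illustration under a few assumptions: the equal weights `rep(1/5, 5)` stand in for the real weights, `n_prime`, `pos`, `win`, `ma` and `res` are names made up for this example, and 4 priming rows are used because a one-sided 5-tap filter only reaches back 4 observations (using 5, as in the question, also works).

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)$Close

# Integer row positions of the window to be smoothed
# (which.i = TRUE returns positions instead of data)
pos <- x["2007-05-18/2007-06-09", which.i = TRUE]

n_prime <- 4                          # assumption: rows needed to prime a 5-tap filter
start <- max(1, pos[1] - n_prime)     # do not run off the top of the series
win <- x[start:pos[length(pos)]]      # window plus priming rows

# Placeholder equal-weight MA(5); swap in the real weights here
ma <- xts(as.numeric(stats::filter(as.numeric(win), rep(1/5, 5), sides = 1)),
          index(win))

# Drop the priming rows so only the requested window remains
res <- ma[(pos[1] - start + 1):nrow(ma)]
```

Here `res` should cover exactly the requested dates with no leading NAs, since the NAs fall on the priming rows that are dropped at the end.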
Joshua Ulrich
  • Breaking the rolling-window problem into a "input" window and an "output" window is a great solution. And the `for` loop construction makes it easy to reframe with `foreach`, which is good because I have a *boatload* of these to run. Thanks! – khoxsey Dec 09 '13 at 15:42
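
Since each week's calculation is independent once the "input" and "output" windows are separated, the loop body can be turned into a function and mapped over the weeks. The following is a rough sketch using base `lapply`, not the code from the answer: `smooth_week` and `n_prime` are hypothetical names, and the equal weights are a placeholder for whatever `calc_weights` would return. Swapping `lapply` for a parallel equivalent should be possible in principle, but that is not shown or tested here.

```r
library(xts)
data(sample_matrix)
x <- as.xts(sample_matrix)$Close

ep <- endpoints(x, "weeks")   # 0, then the last row number of each week
n_prime <- 4                  # assumption: priming rows for a 5-tap one-sided filter

# Smooth one week; i indexes ep, so week rows are (ep[i-1]+1):ep[i]
smooth_week <- function(i) {
  out_rows <- (ep[i - 1] + 1):ep[i]                  # "output" window: the week itself
  in_rows  <- max(1, out_rows[1] - n_prime):ep[i]    # "input" window: week plus priming rows
  w  <- rep(1/5, 5)                                  # placeholder weights (assumption)
  sm <- stats::filter(as.numeric(x[in_rows]), w, sides = 1)
  # keep only the values that belong to the output window
  xts(as.numeric(sm)[in_rows %in% out_rows], index(x)[out_rows])
}

# Skip the first week, which has no earlier data to prime the filter
res <- do.call(rbind, lapply(seq_along(ep)[-(1:2)], smooth_week))
```

With fixed equal weights the result should match a plain rolling mean over the whole series, which makes a convenient sanity check before plugging in per-week weights.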