0

Let's say we have a data set that reflects a series of binary signals--either "1" or "0" over some time period. The goal is to measure the consecutive instances of 1, and sum the total. So, for instance, if 1 occurred four times in a row, the cumulative signal would be 4 for the fourth time period. If the next period--the fifth period--was 0, the counter resets until/if the 1 signal returns. I'm looking for a way to record the equivalent of rolling frequencies of consecutive instances of 1s. I've tried for loops and apply(), but can't seem to find the right mix. Any suggestions would be greatly appreciated. As a test, let's create some fake data. Any ideas on how to use R to solve this question? Thanks!

 library(xts)
 set.seed(5)
 x <- xts(cbind(sample(0:1,50,replace=T)), order=Sys.Date()-100 + 1:50)

 > head(x)
       [,1]
 2014-08-26    0
 2014-08-27    1
 2014-08-28    1
 2014-08-29    0
 2014-08-30    0
 2014-08-31    1
Oliver Keyes
  • 3,294
  • 2
  • 14
  • 23
James Picerno
  • 472
  • 4
  • 16
  • 2
    take a look at `?rle` function. – talat Dec 03 '14 at 20:27
  • Yes, rle() works, but it doesn't record the rolling record by time period. Or am I missing something with rle? In any case, a good (partial) solution. Thanks. – James Picerno Dec 03 '14 at 20:30
  • 2
    Could you add your expected output? This should be quite doable with some judicious `rle`-fu. – Stephan Kolassa Dec 03 '14 at 20:38
  • 1
    S.K. is exactly correct. You can pull almost any sort of association out of the combination of `$lengths` and `$values` that `rle` returns. – Carl Witthoft Dec 03 '14 at 20:41
  • This is for investment analysis on a stock/mutual funds. I'm looking for the equivalent of rollapply() that, for instance, records when prices are above/below a moving average. The glitch here is that I'm trying to sum the instances of consecutive 1s and record the date for when those instances peak. In short, the dates matter. – James Picerno Dec 03 '14 at 20:46

1 Answers1

1

Another approach using ave:

cbind(x, newCol = ave(x, cumsum(c(FALSE, (as.logical(diff(x)))[-1])), 
                      FUN = function(i) seq_along(i) * i))

The result:

           ..1 newCol
2014-08-26   0      0
2014-08-27   1      1
2014-08-28   1      2
2014-08-29   0      0
2014-08-30   0      0
2014-08-31   1      1
2014-09-01   1      2
2014-09-02   1      3
2014-09-03   1      4
2014-09-04   0      0
.
.
.
Sven Hohenstein
  • 80,497
  • 17
  • 145
  • 168