I have a signal position indicator vector consisting out of -1s and 1s. In addition, I have volume data which I want to sum based on the value of Signal. The basic data table looks like this:
df <- cbind(Signal, Volume)
head(df, 20)
Signal Volume
2016-01-04 NA 37912403
2016-01-05 -1 23258238
2016-01-06 -1 25096183
2016-01-07 -1 45172906
2016-01-08 -1 35402298
2016-01-11 -1 29932385
2016-01-12 -1 28395390
2016-01-13 -1 33410553
2016-01-14 -1 48658623
2016-01-15 1 46132781
2016-01-19 1 30998256
2016-01-20 -1 59051429
2016-01-21 1 30518939
2016-01-22 1 30495387
2016-01-25 1 32482015
2016-01-26 -1 26877080
2016-01-27 -1 58699359
2016-01-28 1 107475327
2016-01-29 1 62739548
2016-02-01 1 46132726
What I would like to achieve is (without using a for loop) is to produce a vector of cum Volume, which would be reset every time the Signal changes. In addition, the values of volume should be multiplied by the value of the Signal, i.e. when Signal is -1 it should add -Volume to the current cum Volume. Based on a similar questions on SO I have tried
ave(df$a, cumsum(c(F, diff(sign(diff(df$a))) != 0)*df$Volume), FUN=seq_along)
which produces the right grouping of Signal, but the Volume is not included for some reason. Without the reset the solution is fairly straightforward (posted on SO)
require(data.table)
DT <- data.table(dt)
DT[, Cum.Sum := cumsum(Volume), by=Signal]
Does anyone know a dplyr or data.table kind of solution for both resetting and conditioning a cum sum? Thanks.