I need to reset a cumulative sum whenever a zero occurs in a vector, using R. For example, for the input c(0,1,0,0,0,1,1,1,0,1,1) I want the output 0,1,0,0,0,1,2,3,0,1,2. I checked several answers about resetting a sequence with functions, but those solutions did not work here.

Some of the questions I referred to are "Numeric sequence with condition", "Create counter with multiple variables", and "Comp. Efficent way of resetting sequence if condition met ( R )". I tried different combinations of cumsum, ave and getanID but can't seem to get the output I want.

Kinjal

3 Answers

Perhaps something like this:

vec <- c(0,1,0,0,0,1,1,1,0,1,1)
library(data.table)
# as.numeric drops the names; unname(...) or unlist(..., use.names = FALSE) also work, as in Maurits Evers' answer
as.numeric(unlist(by(vec, rleid(vec), cumsum)))
#output
 0 1 0 0 0 1 2 3 0 1 2

rleid creates a run-length id for each element, so that consecutive equal values share the same id:

rleid(vec)
#output
1 2 3 3 3 4 4 4 5 6 6

This id is then used as the grouping variable.

EDIT: as per suggestion of @digEmAll:

Note that this works only if vec contains only 0 and 1. To make it more generic you should use

rleid(vec > 0)
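
To illustrate why the `> 0` comparison matters, here is a minimal sketch on a made-up vector that contains values other than 0 and 1. So that it runs without data.table, it assumes a small base-R helper, `run_id` (a hypothetical name, not part of any package), that mimics rleid for a single vector, and it groups with ave instead of by.

```r
# Hypothetical base-R stand-in for data.table::rleid on a single vector:
# consecutive equal values get the same integer id.
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), r$lengths)
}

vec2 <- c(0, 2, 3, 0, 1)  # made-up input with values other than 0 and 1

# Grouping on the raw values puts 2 and 3 in separate runs,
# so nothing accumulates.
by_value <- ave(vec2, run_id(vec2), FUN = cumsum)

# Grouping on vec2 > 0 keeps each non-zero stretch together.
by_sign <- ave(vec2, run_id(vec2 > 0), FUN = cumsum)

print(by_value)  # 0 2 3 0 1
print(by_sign)   # 0 2 5 0 1
```

Only the second grouping resets at zeros and accumulates across the whole non-zero stretch.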

missuse

Here is a base R solution using split:

v <- c(0,1,0,0,0,1,1,1,0,1,1)
unname(unlist(lapply(split(v, cumsum(c(0, diff(v) != 0))), cumsum)))
# [1] 0 1 0 0 0 1 2 3 0 1 2

The idea is to split the vector into chunks of consecutive equal values (for a 0/1 vector, a new chunk starts at every switch between 0 and 1), and then calculate the cumsum per chunk.

Instead of unname(unlist(...)) you can also use unlist(..., use.names = FALSE).
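
The one-liner above can be unpacked into its intermediate steps. This is only a sketch of the same base R calls; the names `grp`, `chunks` and `res` are introduced here for illustration.

```r
v <- c(0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1)

# diff(v) != 0 is TRUE wherever the value changes; the cumulative sum
# of those flags gives one group id per run of equal values.
grp <- cumsum(c(0, diff(v) != 0))
print(grp)  # 0 1 2 2 2 3 3 3 4 5 5

# Split into one chunk per group id, then take the cumsum inside each chunk.
chunks <- split(v, grp)
res <- unlist(lapply(chunks, cumsum), use.names = FALSE)
print(res)  # 0 1 0 0 0 1 2 3 0 1 2
```

Because the grouping reacts to any change in value, this sketch shares the limitation of the original: it assumes v contains only 0 and 1.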

Maurits Evers

Another possible solution using ave and rle:

ave(v, inverse.rle(with(rle(v > 0), list(values = seq_along(values), lengths = lengths))), FUN = cumsum)
# [1] 0 1 0 0 0 1 2 3 0 1 2

Note that:

inverse.rle(with(rle(v>0),list(values=seq_along(values),lengths=lengths)))

is equal to:

data.table::rleid(v>0)

and they return the "ids" of the batches of consecutive zero/non-zero elements of v:

[1] 1 2 3 3 3 4 4 4 5 6 6
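
As a rough step-by-step view of the same expression, the sketch below builds the ids in separate statements using only base R. The names `r`, `ids` and `out` are introduced here for illustration.

```r
v <- c(0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1)

# Run-length encode the zero / non-zero pattern.
r <- rle(v > 0)
print(r$lengths)  # 1 1 3 3 1 2

# Replace each run's value with its position (1, 2, 3, ...) and expand
# back to full length: one id per batch of zero / non-zero elements.
ids <- inverse.rle(list(values = seq_along(r$values), lengths = r$lengths))
print(ids)  # 1 2 3 3 3 4 4 4 5 6 6

# Cumulative sum within each batch, returned in the original order.
out <- ave(v, ids, FUN = cumsum)
print(out)  # 0 1 0 0 0 1 2 3 0 1 2
```
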
digEmAll