I'm looking for a vectorized way to compute a cumulative sum that resets every time a 0 occurs. For instance, say we have an array ar = np.array([0,1,0,1,1,0,1,0]). The output I want is then np.array([0,1,0,1,2,0,1,0]).
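To pin down the behaviour I'm after, here is a minimal loop-based sketch. It is only a reference for checking results, not the vectorized solution I'm asking for, and the function name reset_cumsum_reference is made up for illustration.

```python
import numpy as np

def reset_cumsum_reference(ar):
    """Loop-based reference: running sum that restarts at every 0."""
    out = np.empty_like(ar)
    running = 0
    for i, value in enumerate(ar):
        # A zero resets the running total; any other value adds to it.
        running = 0 if value == 0 else running + value
        out[i] = running
    return out

ar = np.array([0, 1, 0, 1, 1, 0, 1, 0])
print(reset_cumsum_reference(ar))  # [0 1 0 1 2 0 1 0]
```

Any faster approach should match this output on the same input.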
I have the following implementations, which work but are not completely vectorized.
Method 1:
import pandas as pd
# Every 0 starts a new group; take the cumulative sum within each group
s = pd.Series(ar)
data = s.groupby(s.eq(0).cumsum()).cumsum().tolist()
Method 2:
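import numpy as np
def intervaled_cumsum(ar):
    # Split before every zero and drop the leading piece before the first zero
    # (this relies on ar starting with 0). The pieces stay in a plain list,
    # since wrapping ragged pieces in np.array() raises an error in recent NumPy.
    split = np.split(ar, np.where(ar < 1)[0])[1:]
    sizes = np.array([len(i) for i in split])
    out = ar.copy()
    arc = ar.cumsum()
    idx = sizes.cumsum()
    # At the start of each later segment, subtract the previous segment's total
    out[idx[0]] = ar[idx[0]] - arc[idx[0]-1]
    out[idx[1:-1]] = ar[idx[1:-1]] - np.diff(arc[idx[:-1]-1])
    return out.cumsum()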
How might I do this in Python? I'm open to any library, not just NumPy.
Shout-out to the answers in this thread: "Multiple cumulative sum within a numpy array".