I have the following function, which takes a pandas Series and returns a running count of consecutive occurrences of a target value (0 or 1), resetting to 0 whenever the run is broken. See below for the expected transformation (the input will always be a binary sequence). My code currently calls this function in a loop many times; I am refactoring it so that, instead of looping over pandas Series, it takes 3-dimensional NumPy arrays as input. The code is also performance critical, and I believe NumPy will provide the speed I need. However, none of my attempted solutions gives the correct result: I can't get an equivalent of the groupby to work. Can someone help me refactor this code into NumPy?
[0,0,0,1,1,1,0,1] --> [0,0,0,1,2,3,0,1]
import pandas as pd
import numpy as np
def calculate_binary_momentum(series, target):
    if target == 0:
        tmp_target = 1
    else:
        tmp_target = 0
    s = series.groupby((series != tmp_target).cumsum()).cumcount()
    return s
a = np.array([[0,0,0,1,1,1,0,1], [1,1,0,0,1,1,1,0]])
b = pd.Series(a[0, :])
c = calculate_binary_momentum(b, 0)
print(c)
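For reference, the groupby((series != tmp_target).cumsum()).cumcount() step can be reproduced on a 1-D array without pandas. The sketch below is one possible translation, not a definitive one; the name `cumcount_since_break` and its `break_mask` parameter are hypothetical and not part of the code above. Every True entry in the mask starts a new group, and the result is each element's distance from the most recent group start.

```python
import numpy as np

def cumcount_since_break(break_mask):
    """Position of each element within its group, where a True entry in
    break_mask starts a new group (hypothetical helper, 1-D input)."""
    break_mask = np.asarray(break_mask, dtype=bool)
    idx = np.arange(break_mask.size)
    # Index of the most recent group start; elements before the first
    # True entry belong to a group that starts at index 0.
    last_start = np.maximum.accumulate(np.where(break_mask, idx, 0))
    return idx - last_start

series = np.array([0, 0, 0, 1, 1, 1, 0, 1])
tmp_target = 1  # the value the pandas version uses when target == 0
print(cumcount_since_break(series != tmp_target))
```

The running maximum plays the role of the group key: it carries the last group-start index forward, so no explicit grouping is needed.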
inp = np.array([[[0,0,0,1,1,1,0,1], [1,1,0,0,1,1,1,0]], [[0,1,0,1,0,1,0,1], [0,1,1,1,1,1,1,0]]])
out = calculate_binary_momentum_3d_np(inp, 0)
out --> [[[0,0,0,1,2,3,0,1], [1,2,0,0,1,2,3,0]], [[0,1,0,1,0,1,0,1], [0,1,2,3,4,5,6,0]]]
This is different from the following question because I am looking for a higher-dimensional implementation. I believe the generalization to more dimensions is non-trivial and therefore merits this question being reopened: Counting consecutive 1's in NumPy array
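One possible NumPy direction for the N-dimensional case, offered as a sketch rather than a definitive answer: take a cumulative sum of the matches along the last axis, then subtract the value that sum had at the most recent non-match. The name `consecutive_run_length` and its `value` parameter are hypothetical. The sketch assumes runs are counted along the last axis, and it targets the expected 3-D output listed above. Note that for a row that starts with 1, such as [1,1,0,0,1,1,1,0], it gives 1,2 for the leading run, whereas the pandas groupby version gives 0,1 there.

```python
import numpy as np

def consecutive_run_length(arr, value=1):
    """Length of the current run of `value` ending at each position,
    counted along the last axis (hypothetical helper, any ndim >= 1)."""
    mask = np.asarray(arr) == value
    # Running total of matches along the last axis.
    csum = np.cumsum(mask, axis=-1)
    # At each non-match, record the running total, then carry it forward
    # so it can be subtracted from every later position in the same row.
    reset = np.maximum.accumulate(np.where(mask, 0, csum), axis=-1)
    return csum - reset

inp = np.array([[[0, 0, 0, 1, 1, 1, 0, 1], [1, 1, 0, 0, 1, 1, 1, 0]],
                [[0, 1, 0, 1, 0, 1, 0, 1], [0, 1, 1, 1, 1, 1, 1, 0]]])
out = consecutive_run_length(inp)
print(out.tolist())
```

Because both `np.cumsum` and `np.maximum.accumulate` take an `axis` argument, the same function works unchanged on 1-D, 2-D, or 3-D input with no Python-level loop.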