
I have the following function, which takes a pandas Series and returns the cumulative count within groups defined by a cumulative sum over a comparison against a target (0 or 1). See below for what the input should be transformed into (the input will always be a binary sequence). My code currently calls this function in a loop several times; I am refactoring it so that, instead of looping over pandas Series, it takes 3-dimensional NumPy arrays as input. The code is also performance-critical, and I believe NumPy will provide the speed I need. However, none of my attempted solutions gives the correct result: I can't get an equivalent of the groupby to work. Can someone help me refactor this code into NumPy?

[0,0,0,1,1,1,0,1] --> [0,0,0,1,2,3,0,1]
import pandas as pd
import numpy as np
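def calculate_binary_momentum(series, target):
    # The binary value that is not the target
    tmp_target = 1 if target == 0 else 0
    # Every element that differs from tmp_target starts a new group;
    # cumcount then numbers the elements within each group from 0
    s = series.groupby((series != tmp_target).cumsum()).cumcount()
    return s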

a = np.array([[0,0,0,1,1,1,0,1], [1,1,0,0,1,1,1,0]])
b = pd.Series(a[0, :])
c = calculate_binary_momentum(b, 0)
print(c)

# Desired usage with a 3D input (calculate_binary_momentum_3d_np is the NumPy function I am looking for)
arr = np.array([[[0,0,0,1,1,1,0,1], [1,1,0,0,1,1,1,0]], [[0,1,0,1,0,1,0,1], [0,1,1,1,1,1,1,0]]])
out = calculate_binary_momentum_3d_np(arr, 0)
# out --> [[[0,0,0,1,2,3,0,1], [1,2,0,0,1,2,3,0]], [[0,1,0,1,0,1,0,1], [0,1,2,3,4,5,6,0]]]
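
One possible way to get this output without a groupby is sketched below. It is not a confirmed answer to the question, only a minimal sketch that assumes the counting always runs along the last axis. The idea is to record, for every position, where the most recent run-breaking element sits (via `np.maximum.accumulate`) and subtract that from the current position. The keyword `count_leading_run` is a hypothetical addition: for a row that begins with a run, such as [1,1,0,0,1,1,1,0], the pandas function above returns [0,1,0,0,1,2,3,0], whereas the desired output lists [1,2,0,0,1,2,3,0], so the flag switches between the two behaviours.

```python
import numpy as np

def calculate_binary_momentum_3d_np(arr, target, count_leading_run=True):
    """Count consecutive occurrences of the non-target value along the last axis.

    count_leading_run is a hypothetical parameter (not in the original code):
    True  -> a run at the very start of a row counts 1, 2, 3, ...
    False -> it counts 0, 1, 2, ... like the pandas groupby/cumcount version.
    """
    arr = np.asarray(arr)
    tmp_target = 1 if target == 0 else 0  # same convention as the pandas version
    hit = arr == tmp_target

    # 1-based position of every element along the last axis
    pos = np.arange(1, arr.shape[-1] + 1)

    # Position of the most recent element that breaks a run (0 if none so far)
    last_break = np.maximum.accumulate(np.where(hit, 0, pos), axis=-1)

    if not count_leading_run:
        # Treat the first element of each row as a break, as cumcount does
        last_break = np.maximum(last_break, 1)

    return pos - last_break

arr = np.array([[[0,0,0,1,1,1,0,1], [1,1,0,0,1,1,1,0]],
                [[0,1,0,1,0,1,0,1], [0,1,1,1,1,1,1,0]]])
out = calculate_binary_momentum_3d_np(arr, 0)
print(out.tolist())
# [[[0, 0, 0, 1, 2, 3, 0, 1], [1, 2, 0, 0, 1, 2, 3, 0]],
#  [[0, 1, 0, 1, 0, 1, 0, 1], [0, 1, 2, 3, 4, 5, 6, 0]]]
```

Because the comparison, `np.where`, and the accumulate all broadcast over the leading axes, the same function should work unchanged for 1D, 2D, or 3D inputs.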

This is different from the question "Counting consecutive 1's in NumPy array" because I am looking for a higher-dimensional implementation. I believe this generalization is non-trivial, so the question merits being reopened.

frankL
  • Can you explain why the desired output is the correct output based on the input? – jared Jul 16 '23 at 19:53
  • @jared Yes, I actually had a mistake and edited it. I'll explain each operation: 1) target=0 so tmp_target=1 2) series != 1 for the first vector gives [T, T, T, F, F, F, T, F] 3) the cumulative sum of this is [1,2,3,3,3,3, 4, 4] 4) when we do series.groupby(cumsum), we get a prettydict: {1: [0], 2: [1], 3: [2, 3, 4, 5], 4: [6, 7]}, which is basically a dictionary of lists with key being cumsum-quantity and the value being the consecutive list of indexes. 5) when we take the cumulative count of this, we get [0,0,0,1,2,3,0,1]. I'll update it with code you can simply copy/paste and run – frankL Jul 16 '23 at 20:18
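
The steps in the comment above can be reproduced directly in pandas, which makes each intermediate value easy to inspect. This is a minimal sketch for the first row with target=0; the variable names `mask`, `group_ids`, `groups`, and `counts` are illustrative and do not appear in the original function.

```python
import pandas as pd

series = pd.Series([0, 0, 0, 1, 1, 1, 0, 1])
tmp_target = 1  # step 1: target == 0, so tmp_target is 1

# Step 2: True wherever the value differs from tmp_target
mask = series != tmp_target
print(mask.tolist())
# [True, True, True, False, False, False, True, False]

# Step 3: the cumulative sum turns each True into the start of a new group id
group_ids = mask.cumsum()
print(group_ids.tolist())
# [1, 2, 3, 3, 3, 3, 4, 4]

# Step 4: group id -> index labels that belong to that group
grouped = series.groupby(group_ids)
groups = {int(k): [int(i) for i in v] for k, v in grouped.groups.items()}
print(groups)
# {1: [0], 2: [1], 3: [2, 3, 4, 5], 4: [6, 7]}

# Step 5: number the elements within each group, starting from 0
counts = grouped.cumcount()
print(counts.tolist())
# [0, 0, 0, 1, 2, 3, 0, 1]
```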
