I am trying to calculate positive streaks, negative streaks, and no streak (zeros) using NumPy exclusively. The part I'm having trouble with is the groupby step, which all my research has led me to believe I need. I found a pandas answer here: Pythonic way to calculate streaks in pandas dataframe
I've been able to convert everything except the groupby piece. Any help is appreciated.
Here is the pandas code I would like to reproduce. The only part without a NumPy equivalent is groupby. I also wrote my own shift function in NumPy.
Pandas version:
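import numpy as np
import pandas as pd

def streaks(df, col):
    sign = np.sign(df[col])
    # label each run of equal signs, then take a running sum within each run
    s = sign.groupby((sign != sign.shift()).cumsum()).cumsum()
    return df.assign(u_streak=s.where(s > 0, 0.0),
                     d_streak=s.where(s < 0, 0.0).abs())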
My partial numpy version:
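import numpy as np

arr = np.array([0.2, 0.1, 0.1, 0.0, -0.2, -0.1, 0.0])
sign = np.sign(arr)
# shift() is my own NumPy helper (not shown here)
s = np.not_equal(sign, shift(sign))
# now I need to group by the run ids and take a cumulative sum within each group;
# groupby() is a placeholder for the piece I can't express in NumPy
np.cumsum(groupby(np.cumsum(s)))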
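The `shift` helper is not shown in the post, so the snippet above cannot be run as written. As a minimal sketch, assuming it is meant to mimic pandas `Series.shift(1)` (move values forward and fill the gap with NaN), something like the following would make the run-id step reproducible. The signature and the `fill` parameter are illustrative guesses, not the original helper.

```python
import numpy as np

def shift(a, n=1, fill=np.nan):
    """Hypothetical stand-in for the helper mentioned in the post.

    Moves values n positions toward the end (n > 0) or the start (n < 0)
    and fills the vacated slots with `fill`, similar to pandas Series.shift.
    """
    a = np.asarray(a, dtype=float)
    out = np.full_like(a, fill)
    if n > 0:
        out[n:] = a[:-n]
    elif n < 0:
        out[:n] = a[-n:]
    else:
        out[:] = a
    return out

arr = np.array([0.2, 0.1, 0.1, 0.0, -0.2, -0.1, 0.0])
sign = np.sign(arr)
# True wherever the sign differs from the previous element (NaN != anything,
# so the first element always starts a run)
s = np.not_equal(sign, shift(sign))
# Running count of run starts gives one integer label per run
run_id = np.cumsum(s)
```

With this stand-in, `run_id` labels each run of equal signs with its own integer, which is the key that the pandas version passes to groupby.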
The expected result should be:
array([1.,2.,3.,0.,-1.,-2.,0.])
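One possible way to get there without groupby, offered as a sketch rather than a definitive answer: since the sign is constant inside a run, the grouped cumulative sum equals the sign times the 1-based position within the run. The position can be found by carrying the index of each run's first element forward with `np.maximum.accumulate`. The function name `streaks_np` and its return shape are assumptions made for illustration.

```python
import numpy as np

def streaks_np(arr):
    """Sketch: signed streak lengths using NumPy only (no groupby)."""
    sign = np.sign(np.asarray(arr, dtype=float))
    if sign.size == 0:
        return sign
    # True where a new run of equal signs begins
    starts = np.empty(sign.size, dtype=bool)
    starts[0] = True
    starts[1:] = sign[1:] != sign[:-1]
    # For every position, the index at which its run started
    idx = np.arange(sign.size)
    run_start = np.maximum.accumulate(np.where(starts, idx, 0))
    # 1-based position within the run, signed; zeros stay zero
    return sign * (idx - run_start + 1)

arr = np.array([0.2, 0.1, 0.1, 0.0, -0.2, -0.1, 0.0])
s = streaks_np(arr)
# Split into up and down streaks, as the pandas version does
u_streak = np.where(s > 0, s, 0.0)
d_streak = np.abs(np.where(s < 0, s, 0.0))
```

On the sample array, `s` matches the expected result above, and `u_streak` / `d_streak` correspond to the two columns the pandas function assigns.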