I thought I knew how to do this, but I'm pulling my hair out over it. I'm trying to use a function to create a new column. The function looks at the value of the win column in the current row and compares it to the value of the win column in the previous row, as the if statements below lay out. The win column will only ever be 0 or 1.
import pandas as pd
data = pd.DataFrame({'win': [0, 0, 1, 1, 1, 0, 1]})
print(data)
   win
0    0
1    0
2    1
3    1
4    1
5    0
6    1
def streak(row):
    win_current_row = row['win']
    win_row_above = row['win'].shift(-1)
    streak_row_above = row['streak'].shift(-1)
    if (win_row_above == 0) & (win_current_row == 0):
        return 0
    elif (win_row_above == 0) & (win_current_row == 1):
        return 1
    elif (win_row_above == 1) & (win_current_row == 1):
        return streak_row_above + 1
    else:
        return 0

data['streak'] = data.apply(streak, axis=1)
All this ends with this error:
AttributeError: ("'numpy.int64' object has no attribute 'shift'", 'occurred at index 0')
In other examples I see functions that refer to df['column'].shift(1), so I'm confused about why I can't do the same thing here.
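For comparison, here is a minimal sketch of what I understand those other examples to be doing: calling shift on a whole column (a Series) instead of on a single value inside apply. The prev_win column and the first_row_value variable are names I made up for illustration.

```python
import pandas as pd

data = pd.DataFrame({'win': [0, 0, 1, 1, 1, 0, 1]})

# Inside apply(axis=1), row['win'] is a single numpy integer, not a Series.
# Pulling one value out of the column shows the same kind of object.
first_row_value = data['win'].iloc[0]
print(type(first_row_value))
print(hasattr(first_row_value, 'shift'))

# shift(1) on the whole column moves every value down one row, so each row
# lines up with the value from the row above (NaN for the first row).
data['prev_win'] = data['win'].shift(1)  # 'prev_win' is a hypothetical helper column
print(data)
```

If that reading is right, shift only exists on the column as a whole, which would explain the AttributeError when it is called on one row's value.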
The output I'm trying to get to is:
result = pd.DataFrame({'win': [0, 0, 1, 1, 1, 0, 1], 'streak': ['NaN', 0, 1, 2, 3, 0, 1]})
print(result)
   win streak
0    0    NaN
1    0      0
2    1      1
3    1      2
4    1      3
5    0      0
6    1      1
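To make the target concrete, here is a rough sketch of one way that output might be produced without apply. It is a swapped-in technique (grouping on a running count of losses), not the row-by-row function above, and it assumes win only ever holds 0 or 1. The names is_loss and streak_group are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({'win': [0, 0, 1, 1, 1, 0, 1]})

# Every 0 starts a new group, so the wins that follow a loss share that
# loss's group number until the next loss appears.
is_loss = data['win'] == 0          # hypothetical helper
streak_group = is_loss.cumsum()     # hypothetical helper

# A running sum of win inside each group counts consecutive wins and
# drops back to 0 on every loss.
streak = data['win'].groupby(streak_group).cumsum()

# The first row has no row above it, so blank it out with NaN
# (this turns the column into floats).
data['streak'] = streak.where(data['win'].shift(1).notna())
print(data)
```

The streak values here come out as floats (0.0, 1.0, ...) with a real NaN in the first row, rather than the 'NaN' string used in the mock-up above.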
Thanks for helping to get me unstuck.