I have a dataframe with all of a fighter's fights, the fight number (i.e. if it is their first, second, etc.), and whether or not they won the fight. I would like to calculate the number of consecutive wins a fighter had gotten before their current fight (i.e. not including if they won the current fight). I am currently working with Python 3.7 in Spyder.
Suppose we have the following dataframe, where win = 1 if the fighter won the fight:
df = pd.DataFrame({'fighter' : ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
'fight_number' : ['1', '2', '3', '4', '1', '2', '3', '1', '2'],
'win' : [0, 0, 1, 1, 1, 1, 0, 1, 1]})
fighter fight_number win
0 A 1 0
1 A 2 0
2 A 3 1
3 A 4 1
4 B 1 1
5 B 2 1
6 B 3 0
7 C 1 1
8 C 2 1
I know that to calculate win streaks across all rows, I can implement the solution proposed here with:
grouper = (df.win != df.win.shift()).cumsum()
df['streak'] = df.groupby(grouper).cumsum()
which produces:
fighter fight_number win streak
0 A 1 0 0
1 A 2 0 0
2 A 3 1 1
3 A 4 1 2
4 B 1 1 3
5 B 2 1 4
6 B 3 0 0
7 C 1 1 1
8 C 2 1 2
But what I need is to apply this approach to subgroups of the dataframe (i.e. to each fighter) and to not include the outcome of the current fight in the count of the streak. So, I am basically trying to have the current win streak of the fighter when they enter the fight.
The target output in this example would therefore be:
fighter fight_number win streak
0 A 1 0 0
1 A 2 0 0
2 A 3 1 0
3 A 4 1 1
4 B 1 1 0
5 B 2 1 1
6 B 3 0 2
7 C 1 1 0
8 C 2 1 1
I appreciate any advice I can get on this, as I am pretty new to Python.