I recently used groupby()
and np.cumsum()
to cumulatively count all the games the winner of a current tennis match won within a given tournament. I used this code to get the column winner_tourney_games_cumulative in the first table:
df.groupby(['tourney_name', 'year', 'winner_name'])['winner_games_played'].cumsum()
dataset of tennis matches for one tournament:
winner_name | loser_name | tourney_name | round | year | winner_tourney_games_cumulative | winner_games_played | loser_games_played |
---|---|---|---|---|---|---|---|
Roger Federer | Sumit Nagal | US Open | R128 | 2019 | 22 | 22 | 13 |
Roger Federer | Damir Dzumhur | US Open | R64 | 2019 | 43 | 21 | 15 |
Roger Federer | Daniel Evans | US Open | R32 | 2019 | 61 | 18 | 5 |
Roger Federer | David Goffin | US Open | R16 | 2019 | 79 | 18 | 4 |
Grigor Dimitrov | Roger Federer | US Open | QF | 2019 | 87 | 24 | 22 |
This is relatively straightforward for the winners, but not so much for the losers. Given that the loser in the current match made it past the first round of the tournament, this means they have won at least one match before, and hence was the winner of a previous match. This complicates things since the column for winners and losers are separate in the data. So, I believe that the short code with np.cumsum()
I wrote (shown above) won't work for the losers.
Can anyone help me on how to create something similar for the losers as I did for the winners (the column winner_tourney_games_cumulative)? What I would like is the cumulative sum of the number of games the loser has won in previous matches of this tournament.
I am envisioning to look something like this for the losers (you can just ignore the x's below):
winner_name | loser_name | loser_tourney_games_cumulative |
---|---|---|
Roger Federer | Sumit Nagal | X |
Roger Federer | Damir Dzumhur | X |
Roger Federer | Daniel Evans | X |
Roger Federer | David Goffin | X |
Grigor Dimitrov | Roger Federer | 101 |
For Roger Federer in the match he lost, it should be 79 (the cumulative games he won in first table from past matches) + 22 (how many games he won in this match) = 101.
Thanks so much in advance!