I have a dataset of football matches as a Pandas dataframe in the form below:
Home Away Result HG AG
0 Liverpool Norwich City Liverpool 4 1
1 West Ham Man City Man City 0 5
2 AFC Bournemouth Sheffield United Draw 1 1
3 Burnley Southampton Burnley 3 0
4 Crystal Palace Everton Draw 0 0
I'd like to track results by team in a dictionary of lists:
{'Liverpool': [W,W, ... ,W], 'West Ham': [W, D, L, ... ], ... }
etc.
My approach naturally is to iterate over all the rows with conditionals (below is psuedo-code):
if df.Result == 'Draw':
dict[df[Home]].append('D')
dict[df[Away]].append('D')
elif df.Home == df.Result:
dict[df[Home]].append('W')
dict[df[Away]].append('L')
else:
dict[df[Home]].append('L')
dict[df[Away]].append('W')
I believe I can do this using df.iterrows()
but I know this isn't the desired approach in general with Pandas. Is there a way I can do this kind of operation while taking advantage of the power of Pandas DataFrames? I have seen that df.Home == df.Result
returns a series of True/False values but I don't know how to utilise this or extend it to the multiple conditions described above.
I have also seen np.where
and np.select
from this answer, but I don't see it being applicable to this situation where I want to do something based off the outcome of a conditional on each row, using an entry in the row as a key.
It feels as though iteration is the only solution here, I'm sure Pandas would support something like this but I have no idea how to search for it.