I am trying to look for a method to find a Player's mean score for the "Last Season"
(Previous Year) and add it in a new column in the original dataframe df
.
I have coded a formula to get a Player's mean score for the current year, excluding the current row, which is as follows:
df['Season Avg'] = df.groupby([df['Player'], df['DateTime'].dt.year])['Score']
.apply(lambda x: x.shift(1).expanding().mean())
However, despite my best attempt at using the shift
function, I can not quite work out how to calculate the previous years mean ("Last Season Avg"
) directly into a new column.
The dataframe is set out as follows:
Player | DateTime | Score | Season Avg |
---|---|---|---|
PlayerB | 2020-MM-DD HH:MM:SS | 40 | NaN |
PlayerA | 2020-MM-DD HH:MM:SS | 50 | NaN |
PlayerA | 2021-MM-DD HH:MM:SS | 100 | NaN |
PlayerB | 2021-MM-DD HH:MM:SS | 200 | NaN |
PlayerA | 2021-MM-DD HH:MM:SS | 160 | 100 |
PlayerB | 2021-MM-DD HH:MM:SS | 140 | 200 |
PlayerB | 2021-MM-DD HH:MM:SS | 160 | 170 |
PlayerA | 2021-MM-DD HH:MM:SS | 200 | 130 |
The new ideal dataframe that I would like:
Player | DateTime | Score | Season Avg | Last Season Avg |
---|---|---|---|---|
PlayerB | 2020-MM-DD HH:MM:SS | 40 | NaN | NaN |
PlayerA | 2020-MM-DD HH:MM:SS | 50 | NaN | NaN |
PlayerA | 2021-MM-DD HH:MM:SS | 100 | NaN | 50 |
PlayerB | 2021-MM-DD HH:MM:SS | 200 | NaN | 40 |
PlayerA | 2021-MM-DD HH:MM:SS | 160 | 100 | 50 |
PlayerB | 2021-MM-DD HH:MM:SS | 140 | 200 | 40 |
PlayerB | 2021-MM-DD HH:MM:SS | 160 | 170 | 40 |
PlayerA | 2021-MM-DD HH:MM:SS | 200 | 130 | 50 |