For example, here is a DataFrame:
import pandas as pd

df = pd.DataFrame({'year': ['2019', '2019', '2019', '2019', '2020', '2020', '2020'],
                   'key': ['a', 'a', 'b', 'c', 'd', 'e', 'f'],
                   'val': [3, 4, 3, 5, 6, 1, 2]})
It looks like
   year key  val
0  2019   a    3
1  2019   a    4
2  2019   b    3
3  2019   c    5
4  2020   d    6
5  2020   e    1
6  2020   f    2
What I want to obtain is
year  key  mean_except_current_key
2019  a    4
      b    4
      c    3.33
2020  d    1.5
      e    4
      f    3.5
That is, group df by year and key, where mean_except_current_key is defined as the mean of val over the year, excluding all rows that have the same key as the current row.
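To make the definition concrete, it can be written as a brute-force loop over each distinct (year, key) pair. This is only a sketch to pin down the expected values, not the idiomatic groupby solution being asked for; the names result, rows, and others are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'year': ['2019', '2019', '2019', '2019', '2020', '2020', '2020'],
    'key': ['a', 'a', 'b', 'c', 'd', 'e', 'f'],
    'val': [3, 4, 3, 5, 6, 1, 2],
})

rows = []
# Visit each distinct (year, key) pair once, in order of first appearance.
for year, key in df[['year', 'key']].drop_duplicates().itertuples(index=False):
    # Rows from the same year whose key differs from the current key.
    others = df[(df['year'] == year) & (df['key'] != key)]
    rows.append((year, key, others['val'].mean()))

# Illustrative result frame, indexed like the desired output.
result = pd.DataFrame(rows, columns=['year', 'key', 'mean_except_current_key'])
result = result.set_index(['year', 'key'])
print(result.round(2))
```

Note that if a year contained only a single key, there would be no other rows to average, and the value would come out as NaN.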
I hope I have made the problem clear, but I can't figure out how to do it. I have found this question; however, it is different from mine.
Thanks for any help.