My intended goal was to perform a simple groupby
, then store the group averages as a new column using .transform('mean')
. Then things got complicated. The catch is that what I really want is an average of n-1 values, where 'n' is the number of rows belonging to each group. Example data, where the column RESULT is my desired output:
import pandas as pd
list_of_tuples = [('A', 3, 4.5),
('A', 2, 4.75),
('A', 5, 4),
('A', 4, 4.25),
('A', 7, 3.5),
('B', 6, 6.75),
('B', 9, 6),
('B', 8, 6.25),
('B', 4, 7.25),
('B', 6, 6.75)]
df = pd.DataFrame.from_records(data=list_of_tuples, columns=['ID', 'VALUE', 'RESULT'])
>>> df
ID VALUE RESULT
0 A 3 4.50
1 A 2 4.75
2 A 5 4.00
3 A 4 4.25
4 A 7 3.50
5 B 6 6.75
6 B 9 6.00
7 B 8 6.25
8 B 4 7.25
9 B 6 6.75
You can see that in the first row the value of RESULT is the average of [2, 5, 4, 7], which is 4.5. Likewise, the value of RESULT for the last row is the average of [6, 9, 8, 4], which is 6.75.
So for each row the value of RESULT should be the group average (grouping on ID) of VALUE excluding the number in VALUE for that particular row.