I have a problem in python. This is my sample data
col1 col2 desired
0 a 1 2.50
1 a 2 2.00
2 a 3 1.50
3 b 2 3.00
4 b 3 2.00
5 c 3 1.67
6 c 1 2.33
7 c 2 2.00
8 c 2 2.00
the input are df['col1']
and df['col2']
. I want to use these two columns to produce the desired outcome in df['desired']
.
The idea is, that I want to group by col1
and calculate the average value of col2
. The only tweak here, though, is that I want to exclude the current row from the average value calculation.
So for row 0, I am grouping by df['col1'] == 'a'
, but only use row 1 and 2 to calculate the average. For row 1, I also group by df['col1'] == 'a'
, but I only use row 1 and 3. And so forth.
The only thing I can think of, is to create a custom function for .transform()
that will take as input the series coming in from the grouped object, but I am not sure how to approach it. Ideally, I am looking for a simpler (pandas?) method to achieve this.