How to divide elements in pandas column by grouped sum

Question

I have a dataframe, df1, that looks like this:

col1   col2
 A      2
 A      3
 A      4
 B      4
 B      8

I want to express each value in col2 as a fraction of the col2 total for its group in col1. The result should be:

col1   col2
 A      0.22
 A      0.33
 A      0.44
 B      0.33
 B      0.67

In other words, the col2 values must sum to 1 within each unique value of col1. Does anyone know how to do this without using for loops?

score 3 · Accepted Answer · answered Jun 08 '21 at 07:38

Use GroupBy.transform to get a Series of per-group sums aligned with the original rows, then divide col2 by it:

df['col2'] /= df.groupby('col1')['col2'].transform('sum')
# equivalent to:
# df['col2'] = df['col2'] / df.groupby('col1')['col2'].transform('sum')
print(df)
  col1      col2
0    A  0.222222
1    A  0.333333
2    A  0.444444
3    B  0.333333
4    B  0.666667
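
As a self-contained sketch of the same idea, the snippet below rebuilds the sample data from the question and keeps the original values by writing the result to a new column. The column name `share` and the variable names `group_totals` and `check` are made up for illustration and are not part of the answer above.

```python
import pandas as pd

# Sample data matching the question.
df = pd.DataFrame({"col1": ["A", "A", "A", "B", "B"],
                   "col2": [2, 3, 4, 4, 8]})

# transform('sum') returns a Series with one entry per original row,
# holding the total of that row's group (9 for A, 12 for B).
group_totals = df.groupby("col1")["col2"].transform("sum")

# Store the fractions in a new column (hypothetical name) instead of
# overwriting col2, so the raw values stay available.
df["share"] = df["col2"] / group_totals

# Sanity check: the shares should add up to 1 within every group.
check = df.groupby("col1")["share"].sum()
print(df)
print(check)
```

Because the transformed Series shares the index of df, the division lines up row by row without any merge or loop.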

score 0 · Answer 2 · answered Jun 08 '21 at 09:45

Another way, though more limiting because it moves col1 into the index (so you need to reset the index afterwards), and possibly less efficient than transform:

df = df.set_index('col1')

# per-label sums; df.sum(level=0) only works on pandas < 2.0
df.div(df.groupby(level=0).sum()).reset_index()

  col1      col2
0    A  0.222222
1    A  0.333333
2    A  0.444444
3    B  0.333333
4    B  0.666667
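
A minimal, self-contained sketch of this index-based variant, assuming a recent pandas version where the per-label sums are computed with `groupby(level=0).sum()`. The names `indexed`, `totals` and `result` are illustrative only, and the original frame is left untouched rather than reassigned.

```python
import pandas as pd

# Sample data matching the question.
df = pd.DataFrame({"col1": ["A", "A", "A", "B", "B"],
                   "col2": [2, 3, 4, 4, 8]})

# Move the grouping column into the index so division can align on it.
indexed = df.set_index("col1")

# One row per index label with the column totals for that label.
totals = indexed.groupby(level=0).sum()

# div aligns on the index labels, so every row is divided by its own
# group's total; reset_index turns col1 back into a regular column.
result = indexed.div(totals).reset_index()
print(result)
```

Note that this divides every remaining column by its group total, so it only fits when all non-index columns are numeric and should all be normalized.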
