44

I have a pandas dataframe which looks like the following:

Name    Missed    Credit    Grade
A       1         3         10
A       1         1         12      
B       2         3         10
B       1         2         20

And my desired output is:

Name    Sum1   Sum2    Average
A       2      4      11
B       3      5      15   

Basically to get the sum of column Credit and Missed and to do average on Grade. What I am doing right now is two groupby on Name and then get sum and average and finally merge the two output dataframes which does not seem to be the best way of doing this. I have also found this on SO which makes sense if I want to work only on one column:

df.groupby('Name')['Credit'].agg(['sum','average'])

But not sure how to do a one-liner for both columns?

ahajib
  • 12,838
  • 29
  • 79
  • 120
  • 1
    https://stackoverflow.com/questions/14529838/apply-multiple-functions-to-multiple-groupby-columns – BENY Feb 21 '18 at 15:49

2 Answers2

83

You need agg by dictionary and then rename columns names:

d = {'Missed':'Sum1', 'Credit':'Sum2','Grade':'Average'}
df=df.groupby('Name').agg({'Missed':'sum', 'Credit':'sum','Grade':'mean'}).rename(columns=d)
print (df)
      Sum1  Sum2  Average
Name                     
A        2     4       11
B        3     5       15

If want also create column from Name:

df = (df.groupby('Name', as_index=False)
       .agg({'Missed':'sum', 'Credit':'sum','Grade':'mean'})
       .rename(columns={'Missed':'Sum1', 'Credit':'Sum2','Grade':'Average'}))
print (df)
  Name  Sum1  Sum2  Average
0    A     2     4       11
1    B     3     5       15

Solution with named aggregations:

df = df.groupby('Name', as_index=False).agg(Sum1=('Missed','sum'), 
                                            Sum2= ('Credit','sum'),
                                            Average=('Grade','mean'))
print (df)
  Name  Sum1  Sum2  Average
0    A     2     4       11
1    B     3     5       15
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • in the last command, is it possible to add a custom function ? – Basile Oct 14 '21 at 11:44
  • 1
    @Basile - sure, use instead `Average=('Grade','mean')` e.g. `Average=('Grade','my _function')` and then `def my_function(x): return x.cumsum().mean()` or `Average=('Grade',lambda x: x.cumsum().mean())` – jezrael Oct 15 '21 at 05:39
3
A = pd.DataFrame.from_dict({'Name':['A','A','B','B'],'Missed':[1,1,2,1],'Credit':[3,1,3,2],'Grades':[10,12,10,20]})

A.groupby('Name').agg({'Missed':'sum','Credit':'sum','Grades':'mean'})
BENY
  • 317,841
  • 20
  • 164
  • 234
ashish trehan
  • 413
  • 1
  • 5
  • 9