Python: Multiple Statistics per Group

Question

I am trying to compute multiple statistics per group. I can get the count for each group, but I can't figure out how to get the percentage for each group.

Here is what I have:

In my example, I hardcoded 881 for all rows to calculate the percent values, but I would like to replace 881 with something like the count of each final_stage and calculate the percent of each final_stage.

Please post a sample df and the expected output df as text, along with an explanation; images can't be copied. – anky, Apr 14 '19 at 07:57
From [ask]: "_DO NOT post images of code, data, error messages, etc. - copy or type the text into the question. Please reserve the use of images for diagrams or demonstrating rendering bugs, things that are impossible to describe accurately via text._" — user2314737, Apr 14 '19 at 08:42

jezrael · Answer 1 · 2019-04-14T08:04:55.223


I believe you need to specify the column after groupby and pass a list of tuples, each pairing a new column name with an aggregate function:

df.groupby('final_stage')['d1'].agg([('ctn','size'), ('percent', lambda x: len(x)/ len(df))])
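For illustration, here is a minimal runnable sketch of the tuple approach. The sample rows and the stage labels ("won", "lost", "open") are made up; only the column names final_stage and d1 come from the question. The keyword form at the end is an alternative spelling that assumes pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "final_stage": ["won", "lost", "won", "open", "lost", "won", "won", "open"],
    "d1": range(8),
})

# Each tuple is (new column name, aggregate function).
# The lambda divides the group's row count by the total row count.
result = df.groupby("final_stage")["d1"].agg(
    [("ctn", "size"), ("percent", lambda x: len(x) / len(df))]
)
print(result)

# Alternative spelling with keyword arguments (assumes pandas >= 0.25).
named = df.groupby("final_stage")["d1"].agg(
    ctn="size", percent=lambda x: len(x) / len(df)
)
```

Note that percent here is a fraction between 0 and 1; multiply by 100 if you want a percentage value.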

Or:

df1 = df.groupby('final_stage')['d1'].size().reset_index(name='ctn')
df1['percent'] = df1['ctn'] / len(df)
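A small self-contained sketch of this two-step variant follows. The sample rows and stage labels are invented for the example; only final_stage and d1 are taken from the question. The value_counts(normalize=True) line is added here only as a cross-check and is not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "final_stage": ["won", "lost", "won", "open", "lost", "won", "won", "open"],
    "d1": range(8),
})

# Step 1: count rows per group and turn the result into a flat DataFrame.
df1 = df.groupby("final_stage")["d1"].size().reset_index(name="ctn")

# Step 2: divide each group's count by the total number of rows.
df1["percent"] = df1["ctn"] / len(df)
print(df1)

# Cross-check: value_counts(normalize=True) returns the same shares directly.
shares = df["final_stage"].value_counts(normalize=True)
```

This form keeps final_stage as a regular column rather than the index, which can be convenient if the result is merged or exported afterwards.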

edited Apr 14 '19 at 08:04

answered Apr 14 '19 at 07:59


Thanks so much for your prompt response! I am struggling with another groupby statement as shown in the following [link](https://stackoverflow.com/questions/55663359/python-summarizing-aggregating-groups-and-sub-groups-in-dataframe/55663833?noredirect=1#comment98022929_55663833). I greatly appreciate your help :) – user9532692 Apr 14 '19 at 08:04
@user9532692 - added solution – jezrael Apr 14 '19 at 08:52
