How can I calculate a column showing the % of total in a groupby?
One way to do it is to calculate it manually after the groupby, as in the last line of this example:
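import numpy as np
import pandas as pd

# Toy data: 10 rows of random integers in [5, 8)
df = pd.DataFrame(np.random.randint(5, 8, (10, 4)), columns=['a', 'b', 'c', 'd'])
# Different aggregate functions for different columns
g = df.groupby('a').agg({'b': ['sum', 'mean'], 'c': ['sum'], 'd': ['sum']})
# Flatten the MultiIndex columns to names like b_sum, b_mean
g.columns = g.columns.map('_'.join)
# Share of the grand total; this column gets appended at the far right
g['b %'] = g['b_sum'] / g['b_sum'].sum()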
However, in my real data I have many more columns, and I need the % right after the sum, so with this approach I'd have to manually change the order of the columns.
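To illustrate the manual reordering I mean, here is a minimal sketch based on the toy example. The fixed seed and the hard-coded column list are assumptions added only so the snippet is repeatable; with my real data the list would be much longer.

```python
import numpy as np
import pandas as pd

# Seed is an assumption, added only so the toy data is repeatable
np.random.seed(0)
df = pd.DataFrame(np.random.randint(5, 8, (10, 4)), columns=['a', 'b', 'c', 'd'])

g = df.groupby('a').agg({'b': ['sum', 'mean'], 'c': ['sum'], 'd': ['sum']})
g.columns = g.columns.map('_'.join)

# The % column is appended as the last column...
g['b %'] = g['b_sum'] / g['b_sum'].sum()

# ...so the desired order has to be spelled out by hand
g = g[['b_sum', 'b %', 'b_mean', 'c_sum', 'd_sum']]
```

This works for five columns, but writing out the full order for every groupby is what I would like to avoid.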
Is there a more direct way of doing it, so that the % is the column right after the sum? Note that I need the agg(), or something equivalent, because in all my groupbys I apply different aggregate functions to different columns (e.g. sum and avg of x, but only the min of y, etc.).