How to apply a user-defined function column-wise on grouped data in pandas

Question

How can I apply a user-defined function column-wise to grouped data in pandas? The function returns a Series of fixed shape.

import numpy as np
import pandas as pd

def getStats(col):
    names = ['mean', 'std']
    return pd.Series([np.mean(col), np.std(col)], index=names, name=col.name)

df = pd.DataFrame({'city':['c1','c2','c1','c2'],
               'age':[10,20,30,40],
               'sal':[1000,2000,3000,4000]})

grp_data = df.groupby('city')
grp_data.apply(getStats)

I tried the snippet above, but I am not getting the result in the expected format:

city | level | age | sal
c1   | mean  | x   | y
c2   | std   | x1  | y1

Could you please help with this?

The expected output is shown in the question above: a DataFrame with city, level, age, and sal as columns. — Rohit A, Jun 08 '20 at 11:46

jezrael · Accepted Answer · 2020-06-08T12:55:16.603

I think a custom function is not necessary here. Instead, aggregate with GroupBy.agg and a list of aggregate functions, reshape with DataFrame.stack, and finally use DataFrame.rename_axis to set the city and level index labels:

out = df.groupby('city').agg(['mean', 'std']).stack().rename_axis(['city','level'])
print(out)
                  age          sal
city level                        
c1   mean   20.000000  2000.000000
     std    14.142136  1414.213562
c2   mean   30.000000  3000.000000
     std    14.142136  1414.213562
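
If the custom function from the question needs to be kept, one possible sketch is to select the value columns first and apply getStats to each column inside each group. The name value_cols and the choice of columns are assumptions made for illustration, not part of the original answer. Note that np.std defaults to ddof=0 (population standard deviation) while the 'std' aggregation in pandas defaults to ddof=1, so the std rows from this variant come out as 10.0 and 1000.0 rather than the 14.142136 and 1414.213562 shown above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'city': ['c1', 'c2', 'c1', 'c2'],
                   'age': [10, 20, 30, 40],
                   'sal': [1000, 2000, 3000, 4000]})

def getStats(col):
    # Same function as in the question; np.std uses ddof=0 by default.
    names = ['mean', 'std']
    return pd.Series([np.mean(col), np.std(col)], index=names, name=col.name)

# Assumption: these are the numeric columns to summarise.
value_cols = ['age', 'sal']

# The outer apply runs once per city; the inner apply runs getStats
# once per selected column of that city's rows.
custom = (df.groupby('city')[value_cols]
            .apply(lambda g: g.apply(getStats))
            .rename_axis(['city', 'level']))

print(custom)
```

Selecting the columns before apply keeps the string column city away from np.mean and np.std, which only make sense for numeric data.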

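To also get the 25th and 75th percentiles, add quantile functions to the list. The helper q sets __name__ on each function so that the row labels read q0.25 and q0.75:

def q(c):
    # Build an aggregation function that returns the c-th quantile.
    def f1(x):
        return x.quantile(c)
    # agg uses __name__ as the label for this statistic.
    f1.__name__ = f'q{c}'
    return f1

out = (df.groupby('city')
         .agg(['mean', 'std', q(0.25), q(0.75)])
         .stack()
         .rename_axis(['city','level']))

print(out)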
                  age          sal
city level                        
c1   mean   20.000000  2000.000000
     std    14.142136  1414.213562
     q0.25  15.000000  1500.000000
     q0.75  25.000000  2500.000000
c2   mean   30.000000  3000.000000
     std    14.142136  1414.213562
     q0.25  25.000000  2500.000000
     q0.75  35.000000  3500.000000
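
The question asks for city, level, age, and sal as ordinary columns, whereas the results above keep city and level in the index. A minimal sketch of one way to flatten them, assuming the same sample data and using the hypothetical variable name flat, is to finish the chain with reset_index:

```python
import pandas as pd

df = pd.DataFrame({'city': ['c1', 'c2', 'c1', 'c2'],
                   'age': [10, 20, 30, 40],
                   'sal': [1000, 2000, 3000, 4000]})

flat = (df.groupby('city')
          .agg(['mean', 'std'])
          .stack()
          .rename_axis(['city', 'level'])
          .reset_index())  # move city and level from the index into columns

print(flat.columns.tolist())  # ['city', 'level', 'age', 'sal']
```

Each city then has one row per statistic, which matches the layout sketched in the question.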

Hi Jezrael, how can I also get the 25th and 75th percentile values in the aggregate list? — Rohit A, Jun 08 '20 at 12:40
@RohitA - It is possible; a modified version of another answer is simpler ;). My answer was edited. — jezrael, Jun 08 '20 at 12:55
