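import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0, 10, size=(10, 3)),
                  columns=['price', 'created_year', 'price_per_cm'],
                  index=range(1, 11))
# artist labels, matching the printed frame below
df['artist'] = ['degas', 'degas', 'renoir', 'picasso', 'renoir',
                'degas', 'picasso', 'picasso', 'degas', 'picasso']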
>>> df
    price  created_year  price_per_cm   artist
1       9             5             4    degas
2       4             0             8    degas
3       2             5             1   renoir
4       0             0             1  picasso
5       9             0             7   renoir
6       5             0             1    degas
7       6             5             8  picasso
8       9             5             3  picasso
9       0             9             7    degas
10      0             5             9  picasso
I want to group by artist and apply different functions to some columns, i.e. mean() to 'price' and max() to 'created_year'. This is how I achieved this:
s1 = df.groupby(['artist'])['price'].mean()
s2 = df.groupby(['artist'])['created_year'].max()
df2 = pd.concat([s1, s2], axis=1)
>>> df2
         price  created_year
artist
degas     4.50             9
picasso   3.75             5
renoir    5.50             5
Is there a more direct way to get to this point instead of generating two series and concatenating them again to a dataframe?
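One possible approach, offered as a sketch rather than the definitive answer: pass a dict to `agg()` on the grouped frame, mapping each column name to the function it should get. The frame below is rebuilt from the values printed above (with `price_per_cm` left out for brevity) so that the result can be compared against `df2`.

```python
import pandas as pd

# Values copied from the printed frame in the question; price_per_cm is omitted
df = pd.DataFrame(
    {
        "price": [9, 4, 2, 0, 9, 5, 6, 9, 0, 0],
        "created_year": [5, 0, 5, 0, 0, 0, 5, 5, 9, 5],
        "artist": ["degas", "degas", "renoir", "picasso", "renoir",
                   "degas", "picasso", "picasso", "degas", "picasso"],
    },
    index=range(1, 11),
)

# A single groupby; the dict maps each column to its aggregation function
df2 = df.groupby("artist").agg({"price": "mean", "created_year": "max"})
print(df2)
```

This should give the same `price` and `created_year` columns as the concat version, without building the intermediate series. On pandas 0.25 or later, named aggregation such as `agg(mean_price=('price', 'mean'))` is an alternative if the output columns need different names.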