I have a large pandas DataFrame and I want to aggregate its columns differently. It has 24 hourly columns (one per hour of the day), which I would like to sum; for all other columns I just want to take the maximum.
I know that I can write out the required aggregations manually, like this:
df_agg = df.groupby('user_id').agg({'hour_0': 'sum',
                                    'hour_1': 'sum',
                                    # ...
                                    'hour_23': 'sum',
                                    'all other columns': 'max'})
but I was wondering whether a more elegant solution exists, along the lines of:
df_agg = df.groupby('user_id').agg({'hour_*': 'sum',
                                    'all other columns != hour_*': 'max'})
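To make the intent concrete, here is a minimal sketch of the behaviour I am after, assuming the hourly columns all share the prefix hour_ and every remaining column except user_id should take the maximum. The toy data, the score column, and the names value_cols and agg_map are made up for illustration; building the dictionary with a comprehension is just one possible approach, not necessarily the most idiomatic one.

```python
import pandas as pd

# Hypothetical toy frame: two hourly columns plus one other column.
df = pd.DataFrame({
    "user_id": [1, 1, 2],
    "hour_0": [1, 2, 3],
    "hour_1": [4, 5, 6],
    "score": [10, 20, 30],
})

# Columns to aggregate: everything except the grouping key.
value_cols = df.columns.drop("user_id")

# Map each column to 'sum' if its name starts with 'hour_', otherwise to 'max'.
agg_map = {col: "sum" if col.startswith("hour_") else "max" for col in value_cols}

df_agg = df.groupby("user_id").agg(agg_map)
print(df_agg)
```

The same dictionary could then be reused for the full frame, since it is derived from the column names rather than typed out by hand.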