I have a datafame like the following:
import pandas as pd
df = pd.DataFrame({
'A': [1, 1, 1, 2, 2, 2],
'B': [1, 2, 3, 4, 5, 6],
'C': [4, 5, 6, 7, 8, 9],
})
Now I want to group and aggregate with two values being produced per group. The result should be similar to the following:
expected = df.groupby('A').agg([min, max])
# B C
# min max min max
# A
# 1 1 3 4 6
# 2 4 6 7 9
However, in my case, instead of two distinct functions min
and max
, I have one function that computes these two values at once:
def minmax(x):
"""This function promises to compute the min and max in one go."""
return min(x), max(x)
Now my question is, how can I use this one function to produce two aggregation values per group?
It's kind of related to this answer but I couldn't figure out how to do it. The best I could come up with is using a doubly-nested apply
however this is not very elegant and also it produces the multi-index on the rows rather than on the columns:
result = df.groupby('A').apply(
lambda g: g.drop(columns='A').apply(
lambda h: pd.Series(dict(zip(['min', 'max'], minmax(h))))
)
)
# B C
# A
# 1 min 1 4
# max 3 6
# 2 min 4 7
# max 6 9