If I have a pandas DataFrame df, the following three ways of calculating the column means all give the same result:
import numpy as np
df.mean(axis=0)
df.apply(np.mean)
df.aggregate(np.mean)
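For reference, here is a minimal self-contained sketch of that comparison. The column names and values are made up for illustration and are not from my real data; only the three calls themselves matter.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 'A' will later serve as the grouping column.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1.0, 3.0, 5.0, 7.0],
    "C": [2.0, 4.0, 6.0, 8.0],
})

by_method = df.mean(axis=0)          # built-in column-wise mean
by_apply = df.apply(np.mean)         # np.mean called on each column
by_aggregate = df.aggregate(np.mean) # aggregation over each column

# All three are Series indexed by column name, holding the same values.
print(by_method)
print(by_apply)
print(by_aggregate)
```

Depending on the pandas version, passing np.mean may emit a warning; the string alias "mean" is an alternative spelling for the aggregate call.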
But what if I group the data first and then use the same methods in a similar way:
groups = df.groupby(by='A')
groups.mean()
groups.apply(np.mean)
groups.aggregate(np.mean)
In this example, .mean and .aggregate give the same result, but .apply does not. With .apply, the grouping column 'A' is returned both as the index and as a column, which was not what I expected or wanted when I came across this issue.
This behaviour seems inconsistent to me. Or am I missing some fundamental difference between these three methods?
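For anyone wanting to reproduce the grouped comparison, here is a minimal sketch on a made-up frame (the columns 'B' and 'C' and all values are hypothetical). As far as I can tell, .apply hands each group's sub-frame to the function, and whether that sub-frame includes the grouping column 'A' depends on the pandas version, so the sketch selects the value columns explicitly before calling .apply. It also uses the string alias "mean" and a lambda in place of np.mean, which is my substitution to keep the result stable across versions.

```python
import pandas as pd

# Hypothetical toy frame; 'A' is the grouping column.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1.0, 3.0, 5.0, 7.0],
    "C": [2.0, 4.0, 6.0, 8.0],
})
groups = df.groupby(by="A")

# Built-in reduction and aggregation: 'A' becomes the index only.
via_mean = groups.mean()
via_agg = groups.aggregate("mean")

# Restrict the groups to the value columns so the function passed to
# .apply never sees 'A', then take the column means of each sub-frame.
via_apply = groups[["B", "C"]].apply(lambda g: g.mean())

print(via_mean)
print(via_agg)
print(via_apply)
```

With the value columns selected up front, all three results have 'A' as the index and only 'B' and 'C' as columns.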