I have a DataFrame
overview
(shortened for this question).
>>> import pandas as pd
>>> import numpy as np
>>>
>>> index = pd.Index(['Abbott PLC', 'Abbott, Rogahn and Bednar'], dtype='object', name='Account Name')
>>> columns = pd.MultiIndex(levels=[['total', 'quantity'], ['count', 'sum']], labels=[[0, 0, 1], [1, 0, 1]])
>>> values = np.array([[755.44, 1. , 19. ], [615.6 , 1. , 18. ]])
>>>
>>> overview = pd.DataFrame(values, columns=columns, index=index)
>>> overview
total quantity
sum count sum
Account Name
Abbott PLC 755.44 1.0 19.0
Abbott, Rogahn and Bednar 615.60 1.0 18.0
The column names are weird because in my actual code, this result is derived from a DataFrame
df
with the following grouping operation.
aggregators = {'total': ['sum', 'count'], 'quantity': 'sum'}
overview = df.groupby('Account Name')['total', 'quantity'].agg(aggregators)
I want to rename the columns of overview
, the desired result looks like this:
gross checkouts items
Account Name
Abbott PLC 755.44 1.0 19.0
Abbott, Rogahn and Bednar 615.60 1.0 18.0
I cannot simply use overview.columns = ['gross', 'checkouts', 'items']
or the answers from this similar question because after using agg
the columns are in arbitrary order. (Applying rename
also seems tricky because of the duplicate name 'sum'
.)
Currently, I am working around the problem by using an OrderedDict
for aggregators
, thus overview
having a deterministic column oder. But assuming the creation of overview
cannot be fixed upstream, how would I arrive at my desired result elegantly?