Suppose you have a simple pandas DataFrame with a MultiIndex:
import pandas as pd
df = pd.DataFrame(1, index=pd.MultiIndex.from_tuples([('one', 'elem1'), ('one', 'elem2'), ('two', 'elem1'), ('two', 'elem2')]),
                  columns=['col1', 'col2'])
Printed as a table:
           col1  col2
one elem1     1     1
    elem2     1     1
two elem1     1     1
    elem2     1     1
Question: How do you add a "Total" row to that DataFrame?
Expected output:
             col1  col2
one   elem1   1.0   1.0
      elem2   1.0   1.0
two   elem1   1.0   1.0
      elem2   1.0   1.0
Total         4.0   4.0
First attempt: Naive implementation
If I just ignore the MultiIndex and follow the standard way:
df.loc['Total'] = df.sum()
Output:
              col1  col2
(one, elem1)     1     1
(one, elem2)     1     1
(two, elem1)     1     1
(two, elem2)     1     1
Total            4     4
The totals look correct, but the MultiIndex has been flattened into a plain Index of tuples: Index([('one', 'elem1'), ('one', 'elem2'), ('two', 'elem1'), ('two', 'elem2'), 'Total'], dtype='object')
Second attempt: Be explicit
df.loc['Total', :] = df.sum()
or (being frustrated and changing the axis just out of spite)
df.loc['Total', :] = df.sum(axis=1)
Output (the same for both calls):
             col1  col2
one   elem1   1.0   1.0
      elem2   1.0   1.0
two   elem1   1.0   1.0
      elem2   1.0   1.0
Total         NaN   NaN
The MultiIndex is preserved, but the Total is wrong (NaN != 4).
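For comparison, here is a minimal sketch of one approach that might give the expected shape. It builds the total as a one-row DataFrame with its own two-level index and appends it with pd.concat. Using an empty string for the second level is an assumption made here, and the names total and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    1,
    index=pd.MultiIndex.from_tuples(
        [("one", "elem1"), ("one", "elem2"), ("two", "elem1"), ("two", "elem2")]
    ),
    columns=["col1", "col2"],
)

# Column sums as a one-row DataFrame (columns col1 and col2).
total = df.sum().to_frame().T

# Give that row a two-level key so it matches the levels of df.index.
# The empty string for the second level is an assumption, not a requirement.
total.index = pd.MultiIndex.from_tuples([("Total", "")])

# Appending a frame with the same number of index levels keeps the MultiIndex.
result = pd.concat([df, total])
print(result)
```

Because both frames hold integers, the values stay integers here instead of being upcast to floats as in the expected output above; whether that matters depends on what the table is used for.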