How can I simply rename a MultiIndex column of a pandas DataFrame using the rename() function?

Let's look at an example and create such a DataFrame:

import pandas
df = pandas.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})
df = df.groupby("A").agg({"B":["min","max"],"C":"mean"})
print(df)

    B        C
  min max mean
A             
1   0   2  1.0
2   3   4  3.5

I am able to select a given MultiIndex column by using a tuple for its name:

print(df[("B","min")])

A
1    0
2    3
Name: (B, min), dtype: int64

However, when I use the same tuple as a key in the rename() function, it does not seem to be accepted; the columns are left unchanged:

df.rename(columns={("B","min"):"renamed"},inplace=True)
print(df)

    B        C
  min max mean
A             
1   0   2  1.0
2   3   4  3.5

Any idea how rename() should be called to deal with MultiIndex columns?

PS: I am aware of the options that flatten the column names first, but they prevent one-liners, so I am looking for a cleaner solution (see my previous question).

marc_s
L. Dubois
  • It looks like https://stackoverflow.com/questions/41221079/rename-multiindex-columns-in-pandas answers this question – SarahD Jul 04 '19 at 09:41

1 Answer

This doesn't answer the question as worded, but it will work for your given example (assuming you want them all renamed with no MultiIndex):

import pandas as pd
df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})
df = df.groupby("A").agg(
    renamed=('B', 'min'),
    B_max=('B', 'max'),
    C_mean=('C', 'mean'),
)
print(df)

   renamed  B_max  C_mean
A                        
1        0      2     1.0
2        3      4     3.5

For more information, see the pandas documentation on named aggregation and the related questions on this topic.
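If you do want to keep the MultiIndex, here is a minimal sketch of two possible approaches, using the example DataFrame from the question. The first relies on the level parameter of rename(), which renames a label within one level (so it affects every "min" label in that level, not only ("B", "min")). The second rebuilds the column labels and replaces only the exact tuple. The names by_level, by_tuple, mapping and new_columns are just illustrative, and there may be cleaner ways to do this.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 1, 1, 2, 2], "B": range(5), "C": range(5)})
df = df.groupby("A").agg({"B": ["min", "max"], "C": "mean"})

# Option 1: rename a label inside a single level of the column MultiIndex.
# Caveat: this renames every "min" label found in level 1.
by_level = df.rename(columns={"min": "renamed"}, level=1)
print(by_level.columns.tolist())

# Option 2: rebuild the column labels, replacing only the exact tuple.
# "mapping" is a hypothetical helper dict: old tuple -> new tuple.
mapping = {("B", "min"): ("B", "renamed")}
new_columns = pd.MultiIndex.from_tuples(
    [mapping.get(col, col) for col in df.columns]
)
by_tuple = df.set_axis(new_columns, axis=1)
print(by_tuple.columns.tolist())
```

Both variants return a new DataFrame and leave df untouched, so they can be chained after the agg() call if a one-liner is wanted.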

ThomasH