93
df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
df

returns

     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300

and

df.columns.levels[1]

returns

Index([u'b', u'c', u'f'], dtype='object')

I want to rename "f" to "e". According to pandas.MultiIndex.rename I run:

df.columns.rename(["b1", "c1", "f1"], level=1)

But it raises

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-110-b171a2b5706c> in <module>()
----> 1 df.columns.rename(["b1", "c1", "f1"], level=1)

C:\Users\USERNAME\AppData\Local\Continuum\Miniconda2\lib\site-packages\pandas\indexes\base.pyc in set_names(self, names, level, inplace)
    994         if level is not None and not is_list_like(level) and is_list_like(
    995                 names):
--> 996             raise TypeError("Names must be a string")
    997 
    998         if not is_list_like(names) and level is None and self.nlevels > 1:

TypeError: Names must be a string

I use Python 2.7.12 |Continuum Analytics, Inc.| (default, Jun 29 2016, 11:07:13) [MSC v.1500 64 bit (AMD64)] and pandas 0.19.1.

Ynjxsjmh
dinya
  • 8
    Another thing you can't do is `df.rename(columns={('d', 'f'): ('e', 'g')})`, even though it seems correct. In other words: `.rename()` does not do what one expects, because even though the key for every column is a tuple, the implementation in pandas is by two lists: `df.keys().levels` and `df.keys().labels`. Changing the key for one column may require you to append an element to `levels`, if you don't want to change all occurrences of that name. – Lukas Mar 09 '19 at 19:15

7 Answers

82

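Use set_levels and assign the result back (older pandas also accepted inplace=True, which has since been removed):

In [22]:
df.columns = df.columns.set_levels(['b1','c1','f1'], level=1)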
df

Out[22]:
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300
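
One detail worth spelling out: set_levels replaces the labels of a level position by position and leaves the per-column codes untouched, so the new list has to follow the order of df.columns.levels[1] rather than the left-to-right column order. A minimal sketch on the question's frame (the variable names old_levels, old_codes and new_columns are only for illustration):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))

# levels holds the unique labels of each level; codes holds, for every
# column, the position of its label inside the corresponding level.
old_levels = list(df.columns.levels[1])
old_codes = list(df.columns.codes[1])
print(old_levels, old_codes)

# set_levels swaps labels position by position and keeps the codes,
# so the replacement list must be ordered like levels[1].
df.columns = df.columns.set_levels(["b1", "c1", "f1"], level=1)

new_columns = df.columns.tolist()
print(new_columns)
```

In this frame the two orders happen to coincide; when they do not, check df.columns.levels[1] before building the replacement list.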

rename sets the name of an index level; it doesn't rename the column labels:

In [26]:
df.columns = df.columns.rename("b1", level=1)
df

Out[26]:
      a         d
b1    b    c    f
0     1    2    3
1    10   20   30
2   100  200  300

This is why you get the error: with a single level, rename expects one string (the new level name), not a list of labels.
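
The difference can be checked directly on the index. The sketch below assumes nothing beyond the question's columns; the names cols, renamed and raised are only for illustration:

```python
import pandas as pd

cols = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))

# rename on a MultiIndex only touches the level names.
renamed = cols.rename("b1", level=1)
print(list(renamed.names))   # the second level is now named 'b1'
print(renamed.tolist())      # the labels themselves are unchanged

# A list of names combined with a scalar level is rejected.
try:
    cols.rename(["b1", "c1", "f1"], level=1)
    raised = False
except TypeError:
    raised = True
print(raised)
```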

EdChum
  • 6
    In `python3`, it is `df.index.set_levels(['b1','c1','f1'],level=1,inplace=True)` – gies0r Jun 04 '20 at 17:21
  • Is it possible to access the column names without printing the dataframe? – Antonio Sesto Feb 19 '23 at 08:25
  • @AntonioSesto yes. `df.columns` for the labels, and `df.columns.names` for the level names – fantabolous May 02 '23 at 05:06
  • I can't get this to work. I get `TypeError: set_levels() got an unexpected keyword argument 'inplace'`. The [current docs](https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.set_levels.html) show no `inplace ` argument. Has something changed? – Bill Aug 16 '23 at 02:13
  • This works: `df.columns = df.columns.set_levels(['b1', 'c1', 'f1'], level=1)` – Bill Aug 16 '23 at 02:14
65

In pandas 0.21.0+, pass level=1 to DataFrame.rename:

d = dict(zip(df.columns.levels[1], ["b1", "c1", "f1"]))
print (d)
{'c': 'c1', 'b': 'b1', 'f': 'f1'}

df = df.rename(columns=d, level=1)
print (df)
     a         d
    b1   c1   f1
0    1    2    3
1   10   20   30
2  100  200  300
jezrael
40

You can use pandas.DataFrame.rename() directly

Say you have the following dataframe

print(df)

     a         d
     b    c    f
0    1    2    3
1   10   20   30
2  100  200  300

df = df.rename(columns={'f': 'f1', 'd': 'd1'})
print(df)

     a        d1
     b    c   f1
0    1    2    3
1   10   20   30
2  100  200  300

As you can see, the mapper is applied to the labels of every level; it is not tied to one particular level.
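
To make that concrete, here is a small sketch on a hypothetical frame in which the label f occurs on both levels. It also uses the level argument of DataFrame.rename to restrict the mapper to one level; the names everywhere and second_only are only for illustration:

```python
import pandas as pd

# Hypothetical frame: "f" appears on the first and on the second level.
df = pd.DataFrame([[1, 2, 3]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "f"), ("f", "f")))

# Without level, the mapper is tried against the labels of every level.
everywhere = df.rename(columns={"f": "f1"})
print(everywhere.columns.tolist())

# With level=1, only labels on the second level are considered.
second_only = df.rename(columns={"f": "f1"}, level=1)
print(second_only.columns.tolist())
```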

Now say you have the following dataframe, where f appears under both a and d:

     a         d
     b    f    f
0    1    2    3
1   10   20   30
2  100  200  300

If you want to rename the f under a, you can do

df.columns = df.columns.values
df.columns = pd.MultiIndex.from_tuples(df.rename(columns={('a', 'f'): ('a', 'af')}))
# or in one line
df.columns = pd.MultiIndex.from_tuples(df.set_axis(df.columns.values, axis=1)
                                       .rename(columns={('a', 'f'): ('a', 'af')}))
print(df)

     a         d
     b   af    f
0    1    2    3
1   10   20   30
2  100  200  300
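
The same flatten, rename, rebuild round trip can probably also be written with MultiIndex.to_flat_index instead of df.columns.values. This is a sketch under the assumption of a reasonably recent pandas (to_flat_index was added in 0.24); the intermediate name flat is only for illustration:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "f"), ("d", "f")))

# Flatten to a plain Index of tuples so that whole tuples can be matched.
flat = df.set_axis(df.columns.to_flat_index(), axis=1)
flat = flat.rename(columns={("a", "f"): ("a", "af")})

# Rebuild the MultiIndex from the renamed tuples.
df.columns = pd.MultiIndex.from_tuples(flat.columns)
print(df.columns.tolist())
```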
Ynjxsjmh
  • Would you be so kind to explain why `df.columns = df.columns.values` is required in the last case? – Thomas Hilger Jun 07 '22 at 18:50
  • 2
    @ThomasHilger To convert the MultiIndex to a list of tuples as the tuple can be matched in `rename`. Another option is to use [pandas.MultiIndex.to_flat_index](https://pandas.pydata.org/docs/reference/api/pandas.MultiIndex.to_flat_index.html). – Ynjxsjmh Jun 08 '22 at 01:49
  • I expected this to work for renaming `f` under `a`: `df.rename(columns={('a', 'f'): ('a', 'af')})`; why does it fail? – Attila the Fun Sep 05 '22 at 21:05
  • 1
    @AttilatheFun `MultiIndex` is different from tuple. – Ynjxsjmh Sep 06 '22 at 01:39
14

There is also Index.set_names, which sets the names of the index levels (one name per level) rather than the labels:

df.index.set_names(["b1", "c1", "f1"], inplace=True)
gies0r
10

Another thing you can't do is df.rename(columns={('d', 'f'): ('e', 'g')}), even though it seems correct. In other words: .rename() does not do what one expects, <...>

-- Lukas, in a comment

A "hacky" way is something like this (at least as of pandas 1.0.5):

def rename_columns(df, columns, inplace=False):
    """Rename dataframe columns.

    Parameters
    ----------
    df : pandas.DataFrame
        Dataframe.
    columns : dict-like
        Mapping from old column labels to new ones. If `df.columns` is a
        :obj:`pandas.MultiIndex`, pass tuples of equal length as keys and values.
    inplace : bool, default False
        If True, modify `df` in place and return ``None``.

    Returns
    -------
    pandas.DataFrame or None
        Dataframe with modified columns, or ``None`` if `inplace` is True.
    
    Examples
    --------
    >>> columns = pd.Index([1, 2, 3])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    ...     1   2   3
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={1 : 10})
    ...    10   2   3
    ... 0   1   2   3
    ... 1  10  20  30
    
    MultiIndex
    
    >>> columns = pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1"), ("A2", "B2", "")])
    >>> df = pd.DataFrame([[1, 2, 3], [10, 20, 30]], columns=columns)
    >>> df
    ...    A0  A1  A2
    ...    B0  B1  B2
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    >>> rename_columns(df, columns={("A2", "B2", "") : ("A3", "B3", "")})
    ...    A0  A1  A3
    ...    B0  B1  B3
    ...    C0  C1
    ... 0   1   2   3
    ... 1  10  20  30
    """
    columns_new = []
    for col in df.columns.values:
        if col in columns:
            columns_new.append(columns[col])
        else:
            columns_new.append(col)
    columns_new = pd.Index(columns_new, tupleize_cols=True)

    if inplace:
        df.columns = columns_new
    else:
        df_new = df.copy()
        df_new.columns = columns_new
        return df_new

So just

>>> df = pd.DataFrame([[1,2,3], [10,20,30], [100,200,300]])
>>> df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))
>>> rename_columns(df, columns={('d', 'f'): ('e', 'g')})
...      a         e
...      b    c    g
... 0    1    2    3
... 1   10   20   30
... 2  100  200  300

What does the pandas-team think about this? Why is this behavior not default?

dinya
  • This only seems to allow you to change the 2nd level, so if you wanted to change, say `("a", "c")` to `("b", "c")`, this doesn't work. I'm not sure why not, but I have a particular use case that needs this treatment. Any clue? – double0darbo Nov 11 '22 at 02:39
  • I had to go the long way around: `pd.MultiIndex.from_tuples( [("b", "c") if t == ("a", "c") else t for t in pd.MultiIndex.from_tuples(df.columns)])`. – double0darbo Nov 11 '22 at 02:43
5

Another way to do that is with pandas.Index.map (df.columns is an Index, not a Series) and a lambda function, as follows:

df.columns = df.columns.map(lambda x: (x[0], "e") if x[1] == "f" else x)

[Out]:
     a         d
     b    c    e
0    1    2    3
1   10   20   30
2  100  200  300
Gonçalo Peres
3

Using dicts to rename tuples

Since iterating over a MultiIndex yields its entries as tuples, and Python dicts accept tuples as keys and values, we can replace them using a dict.

mapping_dict = {("d","f"):("d","e")}

# Dictionary allows using tuples as keys and values
def rename_tuple(tuple_, dict_):
    """Return the replacement for tuple_ if dict_ has one, else tuple_ itself."""
    return dict_.get(tuple_, tuple_)

# Rename chosen elements from list of tuples from df.columns
altered_index_list = [rename_tuple(tuple_,mapping_dict) for tuple_ in df.columns.to_list()]

# Update columns with new renamed columns
df.columns = pd.Index(altered_index_list)

Which returns the intended df

     a         d
     b    c    e
0    1    2    3
1   10   20   30
2  100  200  300

Aggregating in a function

This can then be wrapped in a function to simplify things:

def rename_multi_index(index,mapper):
    """Renames pandas multi_index"""
    return pd.Index([rename_tuple(tuple_,mapper) for tuple_ in index])

# And now simply do
df.columns = rename_multi_index(df.columns,mapping_dict)
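
For comparison, the same tuple-to-tuple lookup can be sketched without helper functions by combining Index.map with dict.get. The name mapping is only for illustration; a function is passed rather than the dict itself so that unmapped keys pass through unchanged:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [10, 20, 30], [100, 200, 300]])
df.columns = pd.MultiIndex.from_tuples((("a", "b"), ("a", "c"), ("d", "f")))

mapping = {("d", "f"): ("d", "e")}

# map receives each column key as a tuple; keys that are not in the
# mapping are returned as they are.
df.columns = df.columns.map(lambda key: mapping.get(key, key))
print(df.columns.tolist())
```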