Pandas: drop a level from a multi-level column index?

Question

If I've got a multi-level column index:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> pd.DataFrame([[1,2], [3,4]], columns=cols)

    a
   ---+--
    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

How can I drop the "a" level of that index, so I end up with:

    b | c
--+---+--
0 | 1 | 2
1 | 3 | 4

It would be nice to have a DataFrame method that does this for both the index and the columns, either by dropping or by selecting levels. – Soerendip, May 24 '18 at 17:56
@Sören Check out https://stackoverflow.com/a/56080234/3198568. `droplevel` can work on either multilevel indexes or columns through the `axis` parameter. – irene, Apr 23 '20 at 07:35

score 465 · Answer 1 · edited Mar 28 '16 at 16:32

You can use MultiIndex.droplevel:

>>> cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
>>> df = pd.DataFrame([[1,2], [3,4]], columns=cols)
>>> df
   a   
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
>>> df.columns = df.columns.droplevel()
>>> df
   b  c
0  1  2
1  3  4

[2 rows x 2 columns]
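
As a follow-up to the snippet above, here is a minimal sketch of being explicit about which level is removed. The three-level frame and the level names `top`, `mid`, `low` are made up for illustration. `droplevel` accepts a position, a level name, or a list of either.

```python
import pandas as pd

# Hypothetical three-level columns; the names "top", "mid", "low" are illustrative.
cols = pd.MultiIndex.from_tuples(
    [("a", "b", "x"), ("a", "c", "y")], names=["top", "mid", "low"]
)
df = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Drop a single level, either by position or by name.
by_position = df.columns.droplevel(0)
by_name = df.columns.droplevel("top")

# Drop several levels in one call, keeping only the middle one.
middle_only = df.columns.droplevel(["top", "low"])

# Assign back to flatten the frame's columns.
df.columns = middle_only
```

Passing a list avoids having to reason about how positions shift after each individual drop.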

edited Mar 28 '16 at 16:32

ASGM

answered Mar 06 '14 at 19:08

DSM

92

It's probably best to explicitly say which level is being dropped. Levels are 0-indexed beginning from the top. `>>> df.columns = df.columns.droplevel(0)` – Ted Petrou Dec 02 '16 at 02:44
12

If the index you are trying to drop is on the left (row) side and not the top (column) side, you can change "columns" to "index" and use the same method: `>>> df.index = df.index.droplevel(1) ` – Idodo Nov 28 '18 at 12:13
9

In pandas version 0.23.4, `df.columns.droplevel()` is no longer available. – yoonghm Dec 02 '18 at 14:28
14

@yoonghm It is there, you are probably just calling it on columns that don't have a multi-index – Matt Harrison Dec 18 '18 at 14:59
1

I had three levels and wanted to keep just the middle one. I found that dropping the lowest (level 2) and then the highest (level 0) worked best: `df.columns = df.columns.droplevel(2)`, then `df.columns = df.columns.droplevel(0)` – Kyle C Feb 05 '19 at 18:36

jxc · Answer 2 · 2019-05-10T15:24:12.290

125

As of Pandas 0.24.0, we can now use DataFrame.droplevel():

cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
df = pd.DataFrame([[1,2], [3,4]], columns=cols)

df.droplevel(0, axis=1) 

#   b  c
#0  1  2
#1  3  4

This is very useful if you want to keep your DataFrame method-chain rolling.
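
To illustrate the method-chaining point, here is a small sketch. The frame and the column names `key` and `val` are invented for the example. It assumes pandas 0.24 or later, where `DataFrame.droplevel` is available.

```python
import pandas as pd

# Illustrative data; "key" and "val" are made-up column names.
raw = pd.DataFrame({"key": ["x", "x", "y"], "val": [1, 2, 3]})

result = (
    raw.groupby("key")
    .agg({"val": ["min", "max"]})   # columns become a MultiIndex: (val, min), (val, max)
    .droplevel(0, axis="columns")   # drop the outer "val" level without leaving the chain
    .reset_index()
)
```

Because `droplevel` returns a new DataFrame, no intermediate assignment to `df.columns` is needed.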

edited May 10 '19 at 15:24

answered May 10 '19 at 15:02

jxc

3

This is the "purest" solution in that a new DataFrame is returned rather than the existing one being modified "in place". – EliadL May 10 '20 at 09:37
5

`df.droplevel(0, axis='columns')` is even more explicit and easy to understand – Guy Jan 19 '21 at 12:37
I will come here forever, because I always forget to set `axis=1`. – igorkf Aug 10 '21 at 14:33

Mint · Answer 3 · 2017-07-25T00:24:33.843

107

Another way to drop the index is to use a list comprehension:

df.columns = [col[1] for col in df.columns]

   b  c
0  1  2
1  3  4

This strategy is also useful if you want to combine the names from both levels like in the example below where the bottom level contains two 'y's:

cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1,2, 8 ], [3,4, 9]], columns=cols)

   A     B
   x  y  y
0  1  2  8
1  3  4  9

Dropping the top level would leave two columns with the index 'y'. That can be avoided by joining the names with the list comprehension.

df.columns = ['_'.join(col) for col in df.columns]

    A_x A_y B_y
0   1   2   8
1   3   4   9

That's a problem I had after doing a groupby and it took a while to find this other question that solved it. I adapted that solution to the specific case here.
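
For the groupby case mentioned above, a minimal sketch might look like the following. The data and the column names `key`, `a`, `b` are assumptions for illustration. `map(str, col)` is used so the join also works when a level holds non-string labels.

```python
import pandas as pd

# Illustrative data; "key", "a" and "b" are made-up column names.
raw = pd.DataFrame({"key": ["x", "x", "y"], "a": [1, 2, 3], "b": [4, 5, 6]})

# Aggregating with a list of functions yields two-level columns:
# (a, min), (a, max), (b, min), (b, max)
agg = raw.groupby("key").agg(["min", "max"])

# Join both levels into a single flat label per column.
agg.columns = ["_".join(map(str, col)) for col in agg.columns]
```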

edited Jul 25 '17 at 00:24

answered Jun 28 '17 at 21:22

Mint

4

`[col[1] for col in df.columns]` is more directly `df.columns.get_level_values(1)`. – Eric O. Lebigot Aug 08 '18 at 15:37
4

Had a similar need wherein some columns had empty level values. Used the following: `[col[0] if col[1] == '' else col[1] for col in df.columns]` – Logan Oct 07 '18 at 00:30
That's awesome. I was needing an easy way to bind level + columns. Thank you. – igorkf Dec 23 '22 at 23:25

score 54 · Answer 4 · answered Apr 17 '16 at 21:57

Another way to do this is to reassign df based on a cross section of df, using the .xs method.

>>> df

    a
    b   c
0   1   2
1   3   4

>>> df = df.xs('a', axis=1, drop_level=True)

    # 'a' : key on which to get cross section
    # axis=1 : get cross section of column
    # drop_level=True : returns cross section without the multilevel index

>>> df

    b   c
0   1   2
1   3   4
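
Note that `.xs` selects as well as drops. The sketch below uses a frame with two top-level labels (made up for illustration) to show that only the columns under the chosen key survive, and that the `level` argument lets the cross section be taken on the inner level instead.

```python
import pandas as pd

# Illustrative frame with two different top-level labels.
cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# Cross section on the top level: keeps only the columns under "A".
top_a = df.xs("A", axis=1)

# Cross section on the second level: keeps the "y" columns, labelled by the top level.
bottom_y = df.xs("y", axis=1, level=1)
```

If the goal is only to remove a level and keep every column, `droplevel` is likely the safer choice.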

answered Apr 17 '16 at 21:57

spacetyper

2

This only works whenever there is a single label for an entire column level. – Ted Petrou Nov 03 '17 at 16:23
1

Does not work when you want to drop the second level. – Soerendip Apr 26 '18 at 21:08
1

This is a nice solution if you want to slice and drop for the same level. If you wanted to slice on the second level (say `b`) then drop that level and be left with the first level (`a`), the following would work: `df = df.xs('b', axis=1, level=1, drop_level=True)` – Tiffany G. Wilson Jun 05 '18 at 17:07

score 20 · Answer 5 · answered Nov 23 '18 at 15:20

A small trick is to use sum with level=1 (this works only when the level-1 labels are all unique):

df.sum(level=1,axis=1)
Out[202]: 
   b  c
0  1  2
1  3  4

A more common solution is get_level_values:

df.columns=df.columns.get_level_values(1)
df
Out[206]: 
   b  c
0  1  2
1  3  4
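
To my knowledge, the `level` argument of `DataFrame.sum` was deprecated in pandas 1.3 and removed in 2.0, so the first snippet may fail on recent versions. The sketch below is one possible equivalent using a group-by on the transposed frame. The example frame is made up and deliberately contains a duplicate inner label to show how the two approaches differ.

```python
import pandas as pd

# Illustrative frame where the inner label "y" appears twice.
cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "y")])
df = pd.DataFrame([[1, 2, 8], [3, 4, 9]], columns=cols)

# Group the columns by their inner level and sum: duplicate labels are merged.
summed = df.T.groupby(level=1).sum().T

# Relabel only: every column is kept, so duplicate labels remain.
relabelled = df.copy()
relabelled.columns = df.columns.get_level_values(1)
```

The summing variant changes the data whenever inner labels repeat, which is why it is only a drop-in replacement when they are unique.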

answered Nov 23 '18 at 15:20

BENY

score 18 · Answer 6 · answered Jun 23 '15 at 00:29

You could also achieve that by renaming the columns:

df.columns = ['b', 'c']

This involves a manual step but could be an option especially if you would eventually rename your data frame.

answered Jun 23 '15 at 00:29

sedeh

This is essentially what Mint's first answer does. Now, there is also no need to specify the list of names (which is generally tedious), as it is given to you by `df.columns.get_level_values(1)`. – Eric O. Lebigot Aug 08 '18 at 15:59

score 8 · Answer 7 · answered Feb 17 '18 at 17:58

I struggled with this problem because I didn't understand why my droplevel() call was not working. After working through several attempts, I learned that 'a' in your table is the columns' name and 'b' and 'c' are the index. Doing this will help:

df.columns.name = None
df.reset_index() #make index become label
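
When `droplevel` seems not to work, it can help to check whether the columns really have more than one level, or whether the extra header row is just the name of a flat column index. A minimal sketch follows. The helper name `drop_top_column_level` is hypothetical, and `rename_axis(columns=...)` assumes pandas 0.24 or later.

```python
import pandas as pd

def drop_top_column_level(df: pd.DataFrame) -> pd.DataFrame:
    """Return df without its outermost column level, if it has more than one."""
    if df.columns.nlevels > 1:
        return df.droplevel(0, axis=1)
    return df  # flat columns: nothing to drop

# A genuine two-level column index.
cols = pd.MultiIndex.from_tuples([("a", "b"), ("a", "c")])
multi = pd.DataFrame([[1, 2], [3, 4]], columns=cols)

# Flat columns whose index merely carries the name "a".
named = pd.DataFrame([[1, 2], [3, 4]], columns=["b", "c"]).rename_axis(columns="a")

flattened = drop_top_column_level(multi)
cleared = named.rename_axis(columns=None)  # remove the name instead of a level
```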

answered Feb 17 '18 at 17:58

dhFrank

1

This does not reproduce the desired output at all. – Eric O. Lebigot Aug 08 '18 at 16:02
1

Based on the date this was posted, `DataFrame.droplevel` might not have been included in your version of pandas (it was added in the stable release 0.24.0, in January 2019) – LinkBerest Jul 30 '19 at 04:28

score 0 · Answer 8 · answered Mar 21 '23 at 06:00

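# Build flat column names, skipping "Unnamed: ..." placeholder labels.
new_columns_cdnr = []
for column in df.columns:
    # Keep only the parts of the column tuple that are real labels.
    new = [x for x in column if 'unnamed' not in str(x).lower()]
    # Use the innermost remaining label as the new column name.
    new_columns_cdnr.append(new[-1])
df.columns = new_columns_cdnr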
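
The loop above appears to target column labels such as `Unnamed: 0_level_1`, which can show up when a multi-row header has blank cells. Here is a self-contained sketch of the same idea. The frame, the exact placeholder string, and the helper name `last_named_label` are assumptions for illustration.

```python
import pandas as pd

# Mimic two-level columns where one inner label is an "Unnamed" placeholder.
cols = pd.MultiIndex.from_tuples(
    [("id", "Unnamed: 0_level_1"), ("price", "min"), ("price", "max")]
)
df = pd.DataFrame([[1, 10, 20], [2, 30, 40]], columns=cols)

def last_named_label(column):
    """Return the innermost label of a column tuple that is not a placeholder."""
    named = [str(part) for part in column if "unnamed" not in str(part).lower()]
    # Fall back to the innermost label if every part is a placeholder.
    return named[-1] if named else str(column[-1])

df.columns = [last_named_label(col) for col in df.columns]
```

The fallback avoids an `IndexError` in the case where every level of a column is a placeholder.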

answered Mar 21 '23 at 06:00

Amol kale
