I have a DataFrame with multi-level columns and a multi-level index:

column_level1               a1      | a2
                           ----+----|----+----
column_level2               b1 | b2 | b3 | b4

index1 | index2 | index3
-------+--------+--------+-----+----+----+-----
   0   |   c1   |   d1   |  1  |  2 |  3 |  4  |
   0   |   c2   |   d3   |  5  |  6 |  7 |  8  |    

How can I reshape my DataFrame to move one of the index levels on top of the column levels? Let's say I want my current index2 to become the new top column level (column_level0).

I also need an efficient solution for this problem.

My current solution is to use stack/unstack in the following way:

df.stack().stack().unstack('index2').unstack().unstack()

But on huge DataFrames this approach consumes too much RAM and takes too much time.

Iulian Stana
  • possible duplicate of [Turn Pandas Multi-Index into column](http://stackoverflow.com/questions/20110170/turn-pandas-multi-index-into-column) – camdenl Mar 16 '15 at 12:01

1 Answer

If you have:

import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_arrays([['a1','a1','a2','a2'], ['b1','b2','b3','b4']])
index = pd.MultiIndex.from_tuples([(0,'c1','d1'), (0, 'c2', 'd3')])
df = pd.DataFrame(np.arange(1,9).reshape(2,-1), columns=columns, index=index)
#         a1    a2   
#         b1 b2 b3 b4
# 0 c1 d1  1  2  3  4
#   c2 d3  5  6  7  8

then you could use reorder_levels to avoid (most of) those stack/unstack calls:

df.unstack(level=1).reorder_levels([2,0,1], axis=1)

yields

      c1  c2  c1  c2  c1  c2  c1  c2
      a1  a1  a1  a1  a2  a2  a2  a2
      b1  b1  b2  b2  b3  b3  b4  b4
0 d1   1 NaN   2 NaN   3 NaN   4 NaN
  d3 NaN   5 NaN   6 NaN   7 NaN   8
unutbu