0
iterables = [[0,1], ["l0", "l1", 'l2']]

df = pd.DataFrame(np.random.randn(4, 6))
df.columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])

and the df looks like

first   0                                     1
second  l0       l1             l2            l0        l1         l2
0   0.906053    0.398538    -1.094092   -0.287806   0.134784    -2.274308
1   -0.594186   0.496815    2.162056    -0.963402   -0.520975   1.485731
2   0.936087    -0.868435   0.660429    0.204297    -0.791717   1.186809
3   2.672840    -0.994447   0.544952    -0.109592   2.698411    -2.294639

which I'd like to convert to some thing looks like

first   second    l0         l1        l2                                             
              
0               0.906053    0.398538    -1.094092   
               -0.594186    0.496815    2.162056    
                0.936087    -0.868435   0.660429    
1               2.672840    -0.994447   0.544952    
                -0.287806   0.134784    -2.274308
                -0.963402   -0.520975   1.485731
2               -0.963402   -0.520975   1.485731
                0.204297    -0.791717   1.186809
               -0.109592    2.698411    -2.294639

basically, I want to split the dataframe into groups based on the 1st level of the multi-index, and then concat them vertically.. if I do

df.unstack(level=0)

then it will change to

  first  second   
    0      l0      0    0.906053
                   1   -0.594186
                   2    0.936087
                   3    2.672840
           l1      0    0.398538
                   1    0.496815
                   2   -0.868435
                   3   -0.994447
           l2      0   -1.094092
                   1    2.162056
                   2    0.660429
                   3    0.544952
    1      l0      0   -0.287806
                   1   -0.963402
                   2    0.204297
                   3   -0.109592
           l1      0    0.134784
                   1   -0.520975
                   2   -0.791717
                   3    2.698411
           l2      0   -2.274308
                   1    1.485731
                   2    1.186809
                   3   -2.294639

and

df.stack(level=0)

returns

second  l0  l1  l2
first           
0   0   0.906053    0.398538    -1.094092
    1   -0.287806   0.134784    -2.274308
1   0   -0.594186   0.496815    2.162056
    1   -0.963402   -0.520975   1.485731
2   0   0.936087    -0.868435   0.660429
    1   0.204297    -0.791717   1.186809
3   0   2.672840    -0.994447   0.544952
    1   -0.109592   2.698411    -2.294639

of which the row is not indexed corretly. hope I made it clear!

  • @anky thanks! The unstack function returns a multiindexed row series but without column index. I want a index for row and a index for column.. – zhuang vapeur Apr 18 '21 at 11:50
  • We would need a [reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) to be able to say whats wrong, you can take help from [here](How to make good reproducible pandas examples ) – anky Apr 18 '21 at 11:52
  • [How to create minimal and reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – anky Apr 18 '21 at 12:06
  • I tried to include an example, hope I made it clear! – zhuang vapeur Apr 18 '21 at 12:34
  • `df.stack(0)` didnot work for this requirement? I posted that too in the comment. BDW thanks for making the question clearer:) – anky Apr 18 '21 at 12:36
  • no, the row is not indexed corretly. It should be indexed by 0 and 1 on the 1st level, and 0123..on the second level – zhuang vapeur Apr 18 '21 at 13:05

2 Answers2

0

IIUC, you need to stack followed by swaplevel and sort_index

df.stack(level=0).swaplevel().sort_index()

second        l0        l1        l2
first                                
0     0 -0.558788  0.594744  1.065517
      1 -0.802450  0.699085 -1.900555
      2  1.230134  0.846437 -0.423816
      3 -1.502994  1.177891 -0.066557
1     0 -2.302171 -0.229552 -0.797856
      1  0.461626 -0.832619  1.960905
      2  0.122817  1.010266 -1.609697
      3 -0.605909 -0.069873  0.777179
anky
  • 74,114
  • 11
  • 41
  • 70
0

I managed to achieve it by

df.stack(level=0).swaplevel(i=0, j=1, axis = 0).reset_index().sort_values(['first'], ascending=True)