Several DataFrames that are identical in size, columns, and indices, but differ slightly in content, are concatenated. What is the best way to build a new index so that the original index levels stay the same and a new outermost level records which DataFrame each row came from?

DataFrame A:
Idx1 | Idx2 || Col
0      0       'A'
       1       'B'
1      0       'C'
       1       'D'

DataFrame B:
Idx1 | Idx2 || Col
0      0       'E'
       1       'F'
1      0       'G'
       1       'H'

DataFrame AB:
Idx0 | Idx1 | Idx2 || Col
0      0      0       'A'
              1       'B'
       1      0       'C'
              1       'D'
1      0      0       'E'
              1       'F'
       1      0       'G'
              1       'H'

I'm still not very comfortable with indices and stacking, but I imagine they are needed here. Any help is greatly appreciated!

Sandbox:

import pandas as pd

A = pd.DataFrame({'Col': ['A', 'B', 'C', 'D'], 'Idx1': [0, 0, 1, 1], 'Idx2': [0, 1, 0, 1]})
B = pd.DataFrame({'Col': ['E', 'F', 'G', 'H'], 'Idx1': [0, 0, 1, 1], 'Idx2': [0, 1, 0, 1]})
# Move Idx1 and Idx2 from columns into a two-level row index
A.set_index(keys=['Idx1', 'Idx2'], inplace=True)
B.set_index(keys=['Idx1', 'Idx2'], inplace=True)
Felix
  • https://stackoverflow.com/questions/49620538/what-are-the-levels-keys-and-names-arguments-for-in-pandas-concat-functio/49620539#49620539 – BENY Aug 15 '18 at 18:09
  • @Wen Thank you my friend, looks good! – Felix Aug 15 '18 at 18:12

1 Answer

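Use pd.concat with the keys and names arguments. keys supplies one label per input DataFrame for the new outermost index level, and names gives that level its name: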

pd.concat([A, B], keys=[0, 1], names=['Idx0'])

The resulting output:

               Col
Idx0 Idx1 Idx2    
0    0    0      A
          1      B
     1    0      C
          1      D
1    0    0      E
          1      F
     1    0      G
          1      H
root