1

I have multiple DataFrames and they all have different sizes:

a

Header1 Header2
111 item1 item2
112 item3 item4
a = pd.DataFrame({
  'Header1': ['item1', 'item3'],
  'Header2': ['item2', 'item4']
}).set_index(pd.Index([111, 112]))

b

Header3 Header4
111 item5 item6
111 item7 item8
111 item9 item10
112 item11 item12
112 item13 item14
112 item15 item16
b = pd.DataFrame({
  'Header3': ['item5', 'item7', 'item9', 'item11', 'item13', 'item15'],
  'Header4': ['item6', 'item8', 'item10', 'item12', 'item14', 'item16']
}).set_index(pd.Index([111, 111, 111, 112,  112, 112]))

I would like to concatenate these two DataFrames like this:

c

Header1 Header2 Header3 Header4
111 item1 item2 item5 item6
111 - - item7 item8
111 - - item9 item10
112 item3 item4 item11 item12
112 - - item13 item14
112 - - item15 item16

When I add rows into the smaller DataFrames I still don't know how to line up the rows like this.

Po Chen Liu
  • 253
  • 2
  • 12

2 Answers2

1

You can simply use pd.concat, which automatically takes indexes into account:

c = pd.concat([a, b], axis=1)

Output:

>>> c
    Header1 Header2 Header3 Header4
111   item1   item2   item5   item6
111   item1   item2   item7   item8
111   item1   item2   item9  item10
112   item3   item4  item11  item12
112   item3   item4  item13  item14
112   item3   item4  item15  item16
1

Assuming the dashes in Header1 and Header2 represent filling the same values, it's simplest to join a and b:

c = a.join(b)

#     Header1 Header2 Header3 Header4
# 111   item1   item2   item5   item6
# 111   item1   item2   item7   item8
# 111   item1   item2   item9  item10
# 112   item3   item4  item11  item12
# 112   item3   item4  item13  item14
# 112   item3   item4  item15  item16
tdy
  • 36,675
  • 19
  • 86
  • 83