
I have two dataframes, df1 and df2.

df1 :

   21   |   20   |   1   |  2   | 3  | 4 | 5  | 8 | 9 | 10

df2 :

1   |   2    |   3    |  4   | 5  
abc     asdf    df       132   248
ban     cat     ball     bcd   aisc

How can I merge the two dataframes so that I get the desired output?

output needed :

  21   |   20   |   1   |  2   | 3  | 4   |   5  | 8  | 9   | 10
  nan      nan     abc     asdf  df   132     248  nan  nan   nan
  nan      nan     ban     cat   ball bcd     aisc nan  nan   nan

1 Answer


You can obtain this with pd.concat(..) [pandas-doc]:

>>> df1
Empty DataFrame
Columns: [21, 20, 1, 2, 3, 4, 5, 8, 9, 10]
Index: []
>>> df2
     1     2     3    4     5
0  abc  asdf    df  132   248
1  ban   cat  ball  bcd  aisc
>>> pd.concat((df1, df2))
     1   10     2   20   21     3    4     5    8    9
0  abc  NaN  asdf  NaN  NaN    df  132   248  NaN  NaN
1  ban  NaN   cat  NaN  NaN  ball  bcd  aisc  NaN  NaN

This will, as the documentation says:

Concatenate pandas objects along a particular axis with optional set logic along the other axes.

Can also add a layer of hierarchical indexing on the concatenation axis, which may be useful if the labels are the same (or overlapping) on the passed axis number.

It will thus take the "union" of the column names of both dataframes, and fill in NaN for every column that is missing in one of the two dataframes.
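In the session above the columns come out sorted rather than in the order of df1, which is what the question asks for. A minimal sketch of two ways to keep the order of df1, assuming the column labels are strings (the sample frames below are rebuilt from the question and are not the asker's actual data): pass sort=False to pd.concat, which as far as I know is the default in recent pandas versions, or skip the concatenation and reindex df2 on the columns of df1.

```python
import pandas as pd

# Sample frames rebuilt from the question; string column labels are an assumption.
df1 = pd.DataFrame(columns=["21", "20", "1", "2", "3", "4", "5", "8", "9", "10"])
df2 = pd.DataFrame(
    [["abc", "asdf", "df", "132", "248"],
     ["ban", "cat", "ball", "bcd", "aisc"]],
    columns=["1", "2", "3", "4", "5"],
)

# Option 1: concatenate without sorting the union of the column names.
result = pd.concat((df1, df2), sort=False)

# Option 2: since df1 has no rows, reindexing df2 on the columns of df1
# gives the same layout; columns absent from df2 are filled with NaN.
aligned = df2.reindex(columns=df1.columns)

print(result)
print(aligned)
```

The reindex variant only fits when every column of df2 also occurs in df1, since columns that are not in df1 would be dropped.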

Note: A column name should not occur multiple times. If that happens, then concat will raise an error, since it is not clear how to handle such a situation.

In case a column name occurs multiple times in your empty dataframe, you can resolve that with:

df1 = pd.DataFrame(columns=df1.columns.unique())

as a preprocessing step.
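A small sketch of that preprocessing step, using a hypothetical empty dataframe in which the label "1" occurs twice (the duplicate is made up for illustration and does not come from the question):

```python
import pandas as pd

# Hypothetical empty frame with a duplicated column label "1".
df1 = pd.DataFrame(columns=["21", "20", "1", "1", "2"])
print(df1.columns.is_unique)  # False: "1" occurs twice

# Rebuild the empty frame with each label kept once, in order of first appearance.
df1 = pd.DataFrame(columns=df1.columns.unique())

df2 = pd.DataFrame([["abc", "asdf"]], columns=["1", "2"])

# With unique labels, the concatenation can align the columns.
result = pd.concat((df1, df2), sort=False)
print(result)
```

This only works because df1 is empty; for a dataframe that holds data, something like df1.loc[:, ~df1.columns.duplicated()] would keep the first column for each label instead of discarding the rows.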

Willem Van Onsem