How to merge two DataFrames when one of them is empty

Question

I have two DataFrames, df1 and df2.

df1 :

   21   |   20   |   1   |  2   | 3  | 4 | 5  | 8 | 9 | 10

df2 :

1   |   2    |   3    |  4   | 5  
abc     asdf    df       132   248
ban     cat     ball     bcd   aisc

How can I merge the two DataFrames so that I get the desired output?

output needed :

  21   |   20   |   1   |  2   | 3  | 4   |   5  | 8  | 9   | 10
  nan      nan     abc     asdf  df   132     248  nan  nan   nan
  nan      nan     ban     cat   ball bcd     aisc nan  nan   nan

The problem is the duplicated column names; you need unique column names in both. — jezrael, Aug 04 '19 at 11:24
created columns like this df = pd.DataFrame(columns=["",.....] — , Aug 04 '19 at 11:27

Willem Van Onsem · Answer 1 · 2019-08-04T11:29:13.697

You can obtain this with concat(..) [pandas-doc]:

>>> df1
Empty DataFrame
Columns: [21, 20, 1, 2, 3, 4, 5, 8, 9, 10]
Index: []
>>> df2
     1     2     3    4     5
0  abc  asdf    df  132   248
1  ban   cat  ball  bcd  aisc
>>> pd.concat((df1, df2))
     1   10     2   20   21     3    4     5    8    9
0  abc  NaN  asdf  NaN  NaN    df  132   248  NaN  NaN
1  ban  NaN   cat  NaN  NaN  ball  bcd  aisc  NaN  NaN

This will, as the documentation says:

Concatenate pandas objects along a particular axis with optional set logic along the other axes.

Can also add a layer of hierarchical indexing on the concatenation axis, which may be useful if the labels are the same (or overlapping) on the passed axis number.

It will thus take the "union" of the column names of both dataframes, and fill in NaN wherever a column is missing from one of the two dataframes.
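The union behaviour described above can be sketched as follows. This is a minimal example, not a definitive recipe: it assumes the column labels are strings (the sorted order "1, 10, 2, 20, ..." in the output above suggests so), and the final reindex(..) step is an addition that restores the column order of df1, since the order returned by concat(..) can differ between pandas versions.

```python
import pandas as pd

# Assumption: column labels are strings, as the sorted output above suggests.
cols = ["21", "20", "1", "2", "3", "4", "5", "8", "9", "10"]

# df1 has the full set of columns but no rows.
df1 = pd.DataFrame(columns=cols)

# df2 has the data, but only for a subset of the columns.
df2 = pd.DataFrame(
    [["abc", "asdf", "df", "132", "248"],
     ["ban", "cat", "ball", "bcd", "aisc"]],
    columns=["1", "2", "3", "4", "5"],
)

# concat builds the union of the column labels; cells with no data become NaN.
merged = pd.concat((df1, df2), sort=False)

# Put the columns back in the order df1 defines.
merged = merged.reindex(columns=df1.columns)

print(merged)
```

Because df1 contributes no rows here, df2.reindex(columns=df1.columns) on its own should yield the same table; concat(..) is the more general tool when both frames carry rows.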

Note: A column name should not occur multiple times within a dataframe. If it does, concat(..) will raise an error, since it is not clear which of the duplicate columns the data should be aligned with.

In case a column name occurs multiple times in your empty dataframe, you can resolve that with:

df1 = pd.DataFrame(columns=df1.columns.unique())

as a preprocessing step.
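A small sketch of that preprocessing step, under assumptions: the duplicated label "1" is made up for illustration, and the second variant (a boolean mask from columns.duplicated()) is not part of the answer above, but an alternative that also works when the dataframe already holds rows.

```python
import pandas as pd

# Hypothetical empty dataframe in which the label "1" occurs twice.
raw = pd.DataFrame(columns=["21", "20", "1", "1", "2"])

# Variant from the answer: rebuild the empty dataframe from the unique labels.
# This only makes sense when the dataframe has no rows to preserve.
df1 = pd.DataFrame(columns=raw.columns.unique())

# Alternative (an addition, not from the answer): keep the first occurrence
# of each label. This also preserves rows if the dataframe is not empty.
df1_alt = raw.loc[:, ~raw.columns.duplicated()]

print(list(df1.columns))
print(list(df1_alt.columns))
```

Both variants keep the first occurrence of each label and preserve the original order of the remaining columns.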

@thoris: this happens if the same column name appears multiple times in your dataframe... — Willem Van Onsem, Aug 04 '19 at 11:25
@thoris: if the duplicates only occur in the *empty* dataframe, you can just create a new one where you filter out the duplicates. — Willem Van Onsem, Aug 04 '19 at 11:29
