Update/merge 2 dataftames with different column names

Question

I'm using dataframes from pandas and I have 2 tables: The first one:

+----------------------------+
|  ID   | Code | Name | Desc |
------------------------------
| 00002 | AAAA | Aaaa | A111 |
------------------------------
| 12345 | BBBB | Bbbb | B222 |
------------------------------
| 01024 | EEEE | Eeee | E333 |
------------------------------
| 00010 | CCCC | Cccc | C444 |
------------------------------
| 00123 | ZZZZ | Zzzz | Z555 |
------------------------------
| ..... | .... | .... | .... |
+----------------------------+

The second table:

+--------------------------------+
|  EID  | Cat | emaN | No | cseD |
----------------------------------
| 00010 |  1  |      |    |      |
----------------------------------
| 12345 |  1  |      |    |      |      
----------------------------------
|       |  1  |      |    |      |
+--------------------------------+

I want to update the second table with the values from the first one, so that it turns out:

+--------------------------------+
|  EID  | Cat | emaN | No | cseD |
----------------------------------
| 00010 |  1  | Сссс |    | С444 |
----------------------------------
| 12345 |  1  | Bbbb |    | B222 |      
----------------------------------
|       |  1  |      |    |      |
+--------------------------------+

But the difficulty is that the column names are different, the key ID -> EID and the values Name -> emaN, Desc -> cseD, and the column Cat (the values are filled initially) and No (empty values) must remain in the output table unchanged. Also in the 2nd table there can be empty EIDs, so this entry should remain as it was.

How is it possible to make such an update or merge?

Thanks.

tawab_shakeel · Accepted Answer · 2019-07-16T12:11:34.627

Try pd.merge with right_on and left_on param in case columns name are different on which you have to merge.

I am applying check if final_df['emaN'] is null then copy value from Code.

Then drop the column of df1 which are not require

I have save the result in new df final_df if you want you can save the data in 'df2'

import numpy as np
import pandas as pd

final_df = pd.merge(df2,df1,left_on='EID' ,right_on='ID',how='left')
final_df['emaN'] = np.where(final_df['emaN'].isnull(),final_df['Code'],final_df['emaN'])
final_df['cseD'] = np.where(final_df['cseD'].isnull(),final_df['Desc'],final_df['cseD'])

final_df.drop(['ID','Code','Name','Desc'],axis=1,inplace=True)

Marcel Flygare · Answer 2 · 2019-07-16T12:23:38.280

1

As far as I understand the question...

pd.merge(FirstDataFrame, SecondDataFrame, left_on='ID', right_on='EID', how='left')['EID','Cat','emaN','No','cseD']

or if you want to join on multiple fields

pd.merge(FirstDataFrame, SecondDataFrame, left_on=['ID', 'Name', 'Desc'], 
         right_on=['EID', 'emaN','cseD'], how='left') 
['EID','Cat','emaN','No','cseD']

Edit: (see comments, filter for the desired columns) added above

edited Jul 16 '19 at 12:23

answered Jul 16 '19 at 12:10

Marcel Flygare

837
10
19

you are merging right but the values would not be in the desired column – tawab_shakeel Jul 16 '19 at 12:14

Update/merge 2 dataftames with different column names

2 Answers2