1

I'm using dataframes from pandas and I have 2 tables: The first one:

+----------------------------+
|  ID   | Code | Name | Desc |
------------------------------
| 00002 | AAAA | Aaaa | A111 |
------------------------------
| 12345 | BBBB | Bbbb | B222 |
------------------------------
| 01024 | EEEE | Eeee | E333 |
------------------------------
| 00010 | CCCC | Cccc | C444 |
------------------------------
| 00123 | ZZZZ | Zzzz | Z555 |
------------------------------
| ..... | .... | .... | .... |
+----------------------------+

The second table:

+--------------------------------+
|  EID  | Cat | emaN | No | cseD |
----------------------------------
| 00010 |  1  |      |    |      |
----------------------------------
| 12345 |  1  |      |    |      |      
----------------------------------
|       |  1  |      |    |      |
+--------------------------------+

I want to update the second table with the values ​​from the first one, so that it turns out:

+--------------------------------+
|  EID  | Cat | emaN | No | cseD |
----------------------------------
| 00010 |  1  | Сссс |    | С444 |
----------------------------------
| 12345 |  1  | Bbbb |    | B222 |      
----------------------------------
|       |  1  |      |    |      |
+--------------------------------+

But the difficulty is that the column names are different, the key ID -> EID and the values ​​Name -> emaN, Desc -> cseD, and the column Cat (the values ​​are filled initially) and No (empty values) must remain in the output table unchanged. Also in the 2nd table there can be empty EIDs, so this entry should remain as it was.

How is it possible to make such an update or merge?

Thanks.

4dyc0wzn
  • 45
  • 3

2 Answers2

1

Try pd.merge with right_on and left_on param in case columns name are different on which you have to merge.

I am applying check if final_df['emaN'] is null then copy value from Code.

Then drop the column of df1 which are not require

I have save the result in new df final_df if you want you can save the data in 'df2'

import numpy as np
import pandas as pd

final_df = pd.merge(df2,df1,left_on='EID' ,right_on='ID',how='left')
final_df['emaN'] = np.where(final_df['emaN'].isnull(),final_df['Code'],final_df['emaN'])
final_df['cseD'] = np.where(final_df['cseD'].isnull(),final_df['Desc'],final_df['cseD'])

final_df.drop(['ID','Code','Name','Desc'],axis=1,inplace=True)
tawab_shakeel
  • 3,701
  • 10
  • 26
1

As far as I understand the question...

pd.merge(FirstDataFrame, SecondDataFrame, left_on='ID', right_on='EID', how='left')['EID','Cat','emaN','No','cseD']

or if you want to join on multiple fields

pd.merge(FirstDataFrame, SecondDataFrame, left_on=['ID', 'Name', 'Desc'], 
         right_on=['EID', 'emaN','cseD'], how='left') 
['EID','Cat','emaN','No','cseD']

Edit: (see comments, filter for the desired columns) added above

Marcel Flygare
  • 837
  • 10
  • 19