Override non null values from one dataframe to another

Question

I would like to override non null values from a dataframe to another dataframe with combination of first row and column (both being unique).

Basically, i am trying to join df2 on df1 only for non null values in df2, keeping df1 rows/column intact.

eg:

df1 =

df2 =

output =

Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) — BeRT2me, Jul 01 '22 at 06:04

Yury · Answer 1 · 2022-07-06T14:00:50.213

This should work:

output = df1.merge(df2, on='ID')
cols = [c for c in df1.columns if c!='ID']
for col in cols:
   output[col] = output[f'{col}_x'].fillna(output[f'{col}_y'])
   output.drop(columns=[f'{col}_x', f'{col}_y'], inplace=True)

Explanation:

At first, we merge two dataframes using ID as a key. The merge joins two dataframes and if there are columns with the same name it adds suffixes _x and _y.
Then we iterate over all the columns in df1 and fill the NA values in the column col_x using on the values in col_y and put the value into a new column col.
We drop the auxiliary columns col_x and col_y

Edit: Still, even with the updated requirements the approach is similar. However, in this case, you need to perform a left outer join and fillna values of the second dataframe. Here is the code:

output = df1.merge(df2, on='ID', how='left')
cols = [c for c in df1.columns if c!='ID']
for col in cols:
   output[col] = output[f'{col}_y'].fillna(output[f'{col}_x'])
   output.drop(columns=[f'{col}_x', f'{col}_y'], inplace=True)

Thanks for above. But i need to update anything NON NULL in df2 to df1 on ID. For eg: In df1 if you see when ID =1 Address = A (Non Null) and in output for ID =1 Address being updated to Ind. — Ajit Singh, Jul 04 '22 at 05:09

Override non null values from one dataframe to another

1 Answers1