How to update a dataframe based on data from another dataframe

Asked Sep 13 '22 at 09:52

Active Nov 03 '22 at 14:53

Viewed 17 times

I need to update my main dataframe (df_old) with the new values and new rows from the updating dataframe (df_new):

df_old:

df_new:

This is the final result I want to get:

As you can see, the "Type of Subscription" values are now updated, and where a value is missing, such as Leo Wild's age, the cell is filled with NaN.

I tried the script explained on this page: "How to compare two dataframes and update with the new value", but it doesn't work on large datasets with many columns and differing row indexes.

This is my script:

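import pandas as pd

df_old = pd.read_excel(r"file_name_old.xlsx")
df_new = pd.read_excel(r"file_name_new.xlsx")

# combine_first aligns on the row index and column labels: it keeps the
# values from df_new and fills its missing cells/rows from df_old.
if not df_old.equals(df_new):
    df = df_new.combine_first(df_old)
else:
    # Nothing changed; without this branch df would be undefined below.
    df = df_old
df.to_excel(r"file_name_final.xlsx", index=False)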

However, this doesn't give me the result I want on my large dataset of 30 columns and thousands of rows.
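
A minimal sketch of the intended update on toy data, in case it helps pin down the problem. Everything here is an assumption made for illustration: the sample rows, the idea that a "Name" column uniquely identifies each row in both files, and the helper name update_by_key. The real files may need a different key column.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel files.
df_old = pd.DataFrame({
    "Name": ["Ann Lee", "Bob Ray"],
    "Age": [34, 51],
    "Type of Subscription": ["Basic", "Basic"],
})
df_new = pd.DataFrame({
    "Name": ["Ann Lee", "Leo Wild"],
    "Type of Subscription": ["Premium", "Basic"],
})


def update_by_key(old: pd.DataFrame, new: pd.DataFrame, key: str) -> pd.DataFrame:
    """Return `old` updated with values and rows from `new`, matched on `key`.

    Assumes `key` values are unique within each frame.
    """
    old_k = old.set_index(key)
    new_k = new.set_index(key)
    # Values from `new` win; cells or rows missing in `new` come from `old`.
    merged = new_k.combine_first(old_k)
    return merged.reset_index()


result = update_by_key(df_old, df_new, "Name")
print(result)
```

Since combine_first aligns on the index, setting the key column as the index first makes rows match by name rather than by their position in the file. Rows that exist only in df_new (Leo Wild here) are added, and columns they lack (Age) are left as NaN.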

Can anyone help me?

edited Nov 03 '22 at 14:53

General Grievance

asked Sep 13 '22 at 09:52

francesco_cocchi

0 Answers