I need to update my main DataFrame (df_old) with the new values and new rows from the updating DataFrame (df_new):
df_old:
df_new:
This is the final result I want to get:
As you can see, the "Type of Subscription" values are now updated, and where a value is missing, such as Leo Wild's age, the cell is filled with NaN.
I tried the script explained on this page: How to compare two dataframes and update with the new value, but it doesn't work on huge datasets with many columns and differing row indexes.
This is my script:
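import pandas as pd

# Load the old (main) data and the new (updating) data
df_old = pd.read_excel(r"file_name_old.xlsx")
df_new = pd.read_excel(r"file_name_new.xlsx")

# Only rebuild the file if the two DataFrames differ
if not df_old.equals(df_new):
    # Values from df_new take priority; gaps are filled from df_old
    df = df_new.combine_first(df_old)
    df.to_excel(r"file_name_final.xlsx", index=False)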
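To make the intended behaviour concrete, here is a minimal sketch on toy data. The rows, the `Name` and `Age` column names, and the helper `update_with_new` are all made up for illustration and are not my real data. Since `combine_first` aligns on the index rather than on row position, the sketch assumes there is a column that uniquely identifies each row and sets it as the index before combining.

```python
import pandas as pd

# Hypothetical toy data, invented only to illustrate the goal.
df_old = pd.DataFrame({
    "Name": ["Ann Lee", "Bob Ray"],
    "Age": [30, 41],
    "Type of Subscription": ["Basic", "Basic"],
})
df_new = pd.DataFrame({
    "Name": ["Bob Ray", "Leo Wild"],
    "Type of Subscription": ["Premium", "Basic"],
})

def update_with_new(old, new, key):
    """Hypothetical helper: prefer values from `new`, fall back to `old`.

    Assumes `key` uniquely identifies a row in both DataFrames.
    """
    old_k = old.set_index(key)
    new_k = new.set_index(key)
    # combine_first aligns on index and columns: non-null values in new_k win,
    # anything missing there is taken from old_k, and rows found in only one
    # DataFrame are kept.
    return new_k.combine_first(old_k).reset_index()

result = update_with_new(df_old, df_new, "Name")
print(result)
```

In this sketch the existing row gets the new subscription type, the row that only exists in the new data is appended, and its missing age stays NaN. Note that a NaN in the new data never overwrites an existing value in the old data, because `combine_first` only fills gaps.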
However, I can't get the result I want on my real dataset, which has 30 columns and thousands of rows.
Can anyone help me?