1

I have two dataframes, and one bigger dataframe needs to be updated based on data from smaller dataframe. So basically if there is a record with matching names I want to update the price in df1, just like in an example below. There might be multiple rows with the same name in df1

df1

id name price 

1 name_1 5,34 

2 name_2 5,36 

3 name_3 4,74 

4 name_4 5,23 

5 name_5 5,94 

6 name_1 5,34 

df2

name price 

name_4 5,17 

name_1 5,37

df_result

id name price 

1 name_1 5,37 

2 name_2 5,36 

3 name_3 4,74 

4 name_4 5,17 

5 name_5 5,94 

6 name_1 5,37

I'm quite stuck. Tried doing this with df.loc[] but I got nowhere. Any ideas?

  • `Tried doing this with df.loc[] but I got nowhere` - add the code from you attempt to your question – Dan Oct 02 '20 at 11:35
  • Use `pd.concat([df1,df2]).drop_duplicates(subset=['name'], keep='last')` – jezrael Oct 02 '20 at 11:40
  • 1
    Alternatively `df1.merge(df2, on="name", how="left").ffill(axis=1).drop("price_x", axis=1)` – Dan Oct 02 '20 at 11:43
  • 1
    @jezrael your method doesn't preserve the `"id"` column on the replaced rows (nor the index if that matters). I'm not sure this is an exact duplicate. – Dan Oct 02 '20 at 11:44
  • @Dan - Agree, reopened – jezrael Oct 02 '20 at 11:45

2 Answers2

2

You are trying to do multiple one-to-one matches, merge can help you here:

df1.merge(df2, on="name", how="left").ffill(axis=1).drop("price_x", axis=1)

by doing a left join, you keep all the values in df1 that don't have matches in df2. The ffill then does null-coallesing where you keep the right most non-null column.


Another option based on Sandeep's answer:

df3 = df1.set_index("name")
df3.update(df2.set_index("name")).reset_index()
Dan
  • 45,079
  • 17
  • 88
  • 157
0

you can use update...like so..

df1.update(df2)
Sandeep Kothari
  • 405
  • 3
  • 6
  • this could work if you set the index to name first. So `df3 = df1.set_index("name")` and then `df3.update(df2.set_index("name")).reset_index()` – Dan Oct 02 '20 at 13:11