Updating a dataframe rows based on another dataframe rows

Question

I have two dataframes, and one bigger dataframe needs to be updated based on data from smaller dataframe. So basically if there is a record with matching names I want to update the price in df1, just like in an example below. There might be multiple rows with the same name in df1

df1

id name price 

1 name_1 5,34 

2 name_2 5,36 

3 name_3 4,74 

4 name_4 5,23 

5 name_5 5,94 

6 name_1 5,34

df2

name price 

name_4 5,17 

name_1 5,37

df_result

id name price 

1 name_1 5,37 

2 name_2 5,36 

3 name_3 4,74 

4 name_4 5,17 

5 name_5 5,94 

6 name_1 5,37

I'm quite stuck. Tried doing this with df.loc[] but I got nowhere. Any ideas?

`Tried doing this with df.loc[] but I got nowhere` - add the code from you attempt to your question — Dan, Oct 02 '20 at 11:35
Use `pd.concat([df1,df2]).drop_duplicates(subset=['name'], keep='last')` — jezrael, Oct 02 '20 at 11:40
Alternatively `df1.merge(df2, on="name", how="left").ffill(axis=1).drop("price_x", axis=1)` — Dan, Oct 02 '20 at 11:43
@jezrael your method doesn't preserve the `"id"` column on the replaced rows (nor the index if that matters). I'm not sure this is an exact duplicate. — Dan, Oct 02 '20 at 11:44

Dan · Answer 1 · 2020-10-02T13:25:58.333

2

You are trying to do multiple one-to-one matches, merge can help you here:

df1.merge(df2, on="name", how="left").ffill(axis=1).drop("price_x", axis=1)

by doing a left join, you keep all the values in df1 that don't have matches in df2. The ffill then does null-coallesing where you keep the right most non-null column.

Another option based on Sandeep's answer:

df3 = df1.set_index("name")
df3.update(df2.set_index("name")).reset_index()

edited Oct 02 '20 at 13:25

answered Oct 02 '20 at 11:47

Dan

45,079
17
88
157

I updated my question. I didn't write one important thing - there are places where there are multiple rows with the same name but with different id – Tomasz Słowik Oct 02 '20 at 12:58
@TomaszSłowik I think my solution works as is for that case...? – Dan Oct 02 '20 at 13:08
It does! Thank you :) Had to play around since I have some mess with my data. – Tomasz Słowik Oct 02 '20 at 13:18
@TomaszSłowik btw check my comment on Sandeeps answer, it might be cleaner for you – Dan Oct 02 '20 at 13:23

score 0 · Accepted Answer · answered Oct 02 '20 at 11:48

0

you can use update...like so..

df1.update(df2)

answered Oct 02 '20 at 11:48

Sandeep Kothari

405
3
6

this could work if you set the index to name first. So `df3 = df1.set_index("name")` and then `df3.update(df2.set_index("name")).reset_index()` – Dan Oct 02 '20 at 13:11

Updating a dataframe rows based on another dataframe rows

2 Answers2

Linked