I have two dataframes:
df1
    A   B  id  col1  col2  col3
0  E1  E2   0   NaN   NaN   NaN
1  E1  E3   1   NaN   NaN   NaN
2  E1  E4   2   NaN   NaN   NaN
3  E2  E1   3   NaN   NaN   NaN
4  E1  E4   4   NaN   NaN   NaN
5  E2  E1   5   NaN   NaN   NaN
df2
    A   B  id  col1  col2  col3
0  E1  E2   3     1     0     1
1  E1  E3   5     0     1     1
I want to update the values in col1, col2, and col3 in df1 with the corresponding values from df2, matching on id, to get:
df3
    A   B  id  col1  col2  col3
0  E1  E2   0   NaN   NaN   NaN
1  E1  E3   1   NaN   NaN   NaN
2  E1  E4   2   NaN   NaN   NaN
3  E2  E1   3     1     0     1
4  E1  E4   4   NaN   NaN   NaN
5  E2  E1   5     0     1     1
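For reproducibility, the three frames shown above can be built roughly as follows. This is a minimal sketch: the dtypes are an assumption (integer id values, float columns holding NaN), since the printed output does not show them.

```python
import numpy as np
import pandas as pd

# df1: the frame to be updated; col1..col3 start out as all-NaN
# (dtypes are assumed, the printed frames do not show them)
df1 = pd.DataFrame({
    "A": ["E1", "E1", "E1", "E2", "E1", "E2"],
    "B": ["E2", "E3", "E4", "E1", "E4", "E1"],
    "id": [0, 1, 2, 3, 4, 5],
    "col1": np.nan,
    "col2": np.nan,
    "col3": np.nan,
})

# df2: the frame holding the new values, keyed by id
df2 = pd.DataFrame({
    "A": ["E1", "E1"],
    "B": ["E2", "E3"],
    "id": [3, 5],
    "col1": [1, 0],
    "col2": [0, 1],
    "col3": [1, 1],
})

# df3: the desired result, written out by hand;
# only col1..col3 change, and only in the rows with id 3 and id 5
df3 = df1.copy()
df3["col1"] = [np.nan, np.nan, np.nan, 1, np.nan, 0]
df3["col2"] = [np.nan, np.nan, np.nan, 0, np.nan, 1]
df3["col3"] = [np.nan, np.nan, np.nan, 1, np.nan, 1]
```

Note that A and B in df3 come from df1, not df2: only the three value columns are taken from df2.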
As my actual dataframe is much larger, I want to use the list of the column names that I would like to update:
add = ['col1', 'col2', 'col3']
How can I use these column names to get the desired result?
I referred to this question and this question, which directed me to use .loc, but I can't figure out how to incorporate a reference to the index and the list of multiple columns, a la:
df1.loc[df1['id'] == df2['id'], add] = df2[add]
Obviously this didn't work...