0

I have one main df that has one column I would like to update with values from a second df1.

The tricky part for me is that I need a match on 2 common columns from each df to know which value to update.

Using an example:

df  col1  col2 col3
1    1A    Z4   4
2    1B    Z5   2
3    1C    Z6   7
4    1D    Z7   1
5    1E    Z12  9

df1  col1  col2 col3
1    1G    Z9   1
2    1B    Z5   2
3    1C    Z6   3
4    1D    Z7   4
5    1E    Z8   5

Output:

df  col1  col2 col3
1    1A    Z4   4 (no match, no update)
2    1B    Z5   2 (match, updated)
3    1C    Z6   3 (match, updated)
4    1D    Z7   4 (match, updated)
5    1E    Z12  9 (not matched on both, no update)

Thank you for your assistance.

swifty
  • 1,182
  • 1
  • 15
  • 36

2 Answers2

2

You can using set_index with update

df1=df1.set_index(['col1','col2'])
df1.update(df2.set_index(['col1','col2']))
df1.reset_index(inplace=True)
df1
Out[528]: 
  col1 col2  col3
0   1A   Z4   4.0
1   1B   Z5   2.0
2   1C   Z6   3.0
3   1D   Z7   4.0
4   1E  Z12   9.0
BENY
  • 317,841
  • 20
  • 164
  • 234
  • Ah thanks very much, that makes sense. Had to fiddle around a bit with a single duplicate multiindex but got there! – swifty Jul 26 '18 at 05:03
0

By using numpy.where along with ternary operator which i found from @jezrael's solution.

df['col3'] = np.where(df['col1'].isin(df1['col1']) & df['col2'].isin(df1['col2']), df1['col3'], df['col3'])
msr_003
  • 1,205
  • 2
  • 10
  • 25