How can I update certain values in one column of a dataframe with values from another dataframe, matching rows on a key column?

import pandas as pd

# Main dataframe
df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': ['a2', 'a1', '?', '?', 'b2'], 'c': [100, 101, 102, 103, 104]})

# Dataframe with new values for column 'b'
df2 = pd.DataFrame({'a':[3,4], 'b':['b5', 'c5']})

# Desired result: main dataframe with 'b' updated wherever 'a' matches a row in df2
pd.DataFrame({'a':[1,2,3,4,5], 'b':['a2', 'a1', 'b5', 'c5', 'b2'], 'c':[100,101, 102, 103, 104]})
user2696565

1 Answer

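You can replace the '?' placeholders with numpy.nan and then use pandas' combine_first, with column 'a' set as the index on both dataframes. combine_first only fills missing (NaN) values in df1, so the values already present in 'b' are left untouched.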

import numpy as np

df1 = df1.replace('?', np.nan)  # mark placeholders as missing
result = df1.set_index('a').combine_first(df2.set_index('a')).reset_index()
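
If the values to be replaced are not always a recognisable placeholder such as '?', a possible alternative is DataFrame.update, which overwrites matching cells with the non-missing values from the other frame, aligned on the index. This is only a sketch that swaps in a different method from the combine_first approach above; the names updated and result are chosen here for illustration.

```python
import pandas as pd

# Same example data as in the question
df1 = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                    'b': ['a2', 'a1', '?', '?', 'b2'],
                    'c': [100, 101, 102, 103, 104]})
df2 = pd.DataFrame({'a': [3, 4], 'b': ['b5', 'c5']})

# Use 'a' as the key on both frames so rows are aligned by its values
updated = df1.set_index('a')

# update() modifies `updated` in place and returns None; only cells where
# df2 has a non-missing value for the same index and column are overwritten
updated.update(df2.set_index('a'))

# Turn 'a' back into a regular column
result = updated.reset_index()
print(result)
```

Unlike combine_first, update does not need the '?' entries converted to NaN first, but it will also overwrite real values in df1 whenever df2 has an entry for the same key, so it only fits if that is the intended behaviour.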
It_is_Chris