I need to compare new datasets to existing sql datasets and update them if new information is presenting itself.
data from db:
dfa
id foo bar
1 2 "home"
2 5 "work"
3 6 "car"
4 99 "people"
new data:
dfb
id foo bar
1 22 "home"
2 5 "work"
8 8 "pet"
4 99 "humans"
What I need is a way to recognize that for id 1, there is a different value in column foo and that for id 4 there is a new value for column bar. And then update the dataframe from the db before sending it back to the db. I'd like to do this in a runtime efficient maner.
dfout
id foo bar
1 22 "home"
2 5 "work"
3 6 "car"
4 99 "humans"
I have searched the web for a solution. But I can't find my specific case and I have trouble fitting what I do find into my case. Can someone explain how I would do this?
These seem related but deal with non overlapping data and entire new rows. Pandas sort columns and find difference Python Pandas - Find difference between two data frames