I found this post, but it's not quite my scenario. Is there an efficient way of comparing two data frames
The reason I want to compare two dataframes is that I am looking for changes that may have occured. (Think "audit"). The two frames have exactly the same column layout, just that one may have more or less rows than the other, and values may have changed in two of the columns.
ID Period Price OtherID
0 10100001 2019-10-01 8995.00 ABBD
1 38730001 2019-11-01 38227.71 EIRU
2 30100453 2019-12-01 22307.00 DDHF
3 92835543 2020-01-01 2310.00 DDGF
4 66453422 2020-02-01 12113.29 DGFH
ID Period Price OtherID
0 10100001 2019-10-01 5.00 ABBD
1 38730001 2019-11-01 38227.71 XXXX
2 30100453 2019-12-01 22307.00 DDHF
3 92835543 2020-01-01 2310.00 DDGF
4 66453422 2020-02-01 12113.29 DGFH
5 22223422 2020-02-01 123.29 HHYG
The two columns that I am suspicious about are "Price" and "OtherID". I want to find any changes in the Price or OtherID.
I see three scenarios:
1. row has been added
2. row has been deleted
3. row doesn't match
I can iterate through it all, but I'm wondering if there is some pandas magic out there that will do it in one fell swoop.
The output I seek is something like this:
ID Period Analysis
0 10100001 2019-10-01 Changed
1 38730001 2019-11-01 Changed
3 22223422 2020-02-01 New
And just to be clear, ID by itself is not unique. And Period by itself is also not unique. The two together are.