
I would like to compare two dataframes and identify which particular cells have been updated.

I am only able to get the changes at the dataframe level and cannot zoom in to the cell level.

Company_ID is the unique key and won't be changed

df1 is the original data

Company_ID  Name    Grade
1           abc      A
2           zzz      B
3           xxx      C
4           yyy      D

df2 is the updated data

Company_ID  Name    Grade
2           zzz      B
3           xxx_new  C
4           yyy      D+
5           xyz      E

I need to report:

  • Company_ID 1 has been removed
  • Company_ID 3 Name has been updated from xxx to xxx_new
  • Company_ID 4 Grade has been updated from D to D+
  • Company_ID 5 has been added
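For the added and removed rows, one possible approach is an outer merge on the key with `indicator=True`. This is only a sketch on the sample data above; the variable names `merged`, `removed_ids` and `added_ids` are made up for illustration, and it assumes Company_ID is unique in both frames.

```python
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({
    "Company_ID": [1, 2, 3, 4],
    "Name": ["abc", "zzz", "xxx", "yyy"],
    "Grade": ["A", "B", "C", "D"],
})
df2 = pd.DataFrame({
    "Company_ID": [2, 3, 4, 5],
    "Name": ["zzz", "xxx_new", "yyy", "xyz"],
    "Grade": ["B", "C", "D+", "E"],
})

# Outer-merge only the key column; the "_merge" column added by
# indicator=True says which frame each Company_ID came from.
merged = df1[["Company_ID"]].merge(
    df2[["Company_ID"]], on="Company_ID", how="outer", indicator=True
)

# "left_only" = present only in df1 (removed),
# "right_only" = present only in df2 (added).
removed_ids = merged.loc[merged["_merge"] == "left_only", "Company_ID"].tolist()
added_ids = merged.loc[merged["_merge"] == "right_only", "Company_ID"].tolist()

for cid in removed_ids:
    print(f"Company_ID {cid} has been removed")
for cid in added_ids:
    print(f"Company_ID {cid} has been added")
```

This covers only whole rows; the cells that changed within rows present in both frames still need a separate comparison.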

The output can be a summary text file that highlights only the cells that were updated. My actual data has more than 70 columns; I have scaled it down to 3 columns for this example. What I cannot do is drill down to the cell level to tell which particular cell has changed.
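For the cell-level part, a minimal sketch of one possible approach: index both frames by Company_ID, keep only the IDs and columns present in both, and build a boolean mask of the cells that differ. The names `old`, `new`, `changed` and `lines` are hypothetical, and the sketch assumes Company_ID is unique and that two missing values in the same cell count as unchanged.

```python
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({
    "Company_ID": [1, 2, 3, 4],
    "Name": ["abc", "zzz", "xxx", "yyy"],
    "Grade": ["A", "B", "C", "D"],
})
df2 = pd.DataFrame({
    "Company_ID": [2, 3, 4, 5],
    "Name": ["zzz", "xxx_new", "yyy", "xyz"],
    "Grade": ["B", "C", "D+", "E"],
})

key = "Company_ID"
old = df1.set_index(key)
new = df2.set_index(key)

# Restrict both frames to the IDs and columns they share, so the
# comparison below runs on identically labelled frames.
common_ids = old.index.intersection(new.index)
common_cols = [c for c in old.columns if c in new.columns]
a = old.loc[common_ids, common_cols]
b = new.loc[common_ids, common_cols]

# True where a cell differs; cells that are NaN on both sides are
# treated as unchanged (NaN != NaN would otherwise flag them).
changed = (a != b) & ~(a.isna() & b.isna())

# Stack to one boolean per (Company_ID, column) pair, keep the True ones.
stacked = changed.stack()
lines = [
    f"{key} {cid} {col} has been updated from {a.at[cid, col]} to {b.at[cid, col]}"
    for cid, col in stacked[stacked].index
]

print("\n".join(lines))
```

The same mask should work unchanged for 70 columns, since nothing in it depends on the column names. The `lines` list can then be written to a text file with the usual `open(..., "w")`.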

amy
  • What do you expect your output to be? These are all sorts of different changes at a record level – simplycoding Jun 18 '19 at 02:56
  • Maybe try one of these answers https://stackoverflow.com/questions/36891977/pandas-diff-of-two-dataframes – brennan Jun 18 '19 at 02:57
  • The output can be a summary text file that highlights only the cells that were updated. My actual data has more than 70 columns; I have scaled it down to 3 columns for this example. What I cannot do is drill down to the cell level to tell which particular cell has changed. – amy Jun 18 '19 at 03:01
