
I am trying to return the differences between two data frames, but comparing only some of the columns.

df1:

df 1

df2:

df 2

I wrote the following to filter out the differences:

df = df.merge(saved, indicator=True, how='left').loc[lambda x: x['_merge'] != 'both']

And it returned

returned

But I want to return only the rows that differ in colA and colB, rather than dropping only the rows that are identical in every column, so that I get the dataframe below:

what I want:

what I want

dkcloud9
https://stackoverflow.com/questions/48647534/python-pandas-find-difference-between-two-data-frames/48647840#48647840 – BENY Feb 26 '20 at 14:08

1 Answer


You can pass on=['colA','colB'] to DataFrame.merge so that rows are matched on those columns only:

df = (df.merge(saved, indicator=True, how='left', on=['colA', 'colB'])
        .loc[lambda x: x['_merge'] != 'both'])
print(df)
  colA  colB colC_x colC_y     _merge
2    C     3      Y    NaN  left_only
3    D     4      X    NaN  left_only
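
If the colC_x / colC_y columns are not wanted, one possible variation is to merge only against the key columns of saved. The sketch below is self-contained; since the original frames were posted as images, the sample data is made up for illustration, and the names keys, merged and diff are my own, not from the question.

```python
import pandas as pd

# Hypothetical sample data (assumed; the real frames were only shown as images).
df = pd.DataFrame({"colA": list("ABCD"), "colB": [1, 2, 3, 4], "colC": list("XXYX")})
saved = pd.DataFrame({"colA": list("AB"), "colB": [1, 2], "colC": list("XY")})

keys = ["colA", "colB"]

# Merge only against the key columns of `saved`, so colC is not duplicated
# into colC_x / colC_y; drop_duplicates avoids multiplying rows of `df`.
merged = df.merge(saved[keys].drop_duplicates(), on=keys, how="left", indicator=True)

# Keep rows whose (colA, colB) pair does not appear in `saved`,
# then drop the helper column added by indicator=True.
diff = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(diff)
```

With this approach diff keeps the original columns of df, and a row counts as "different" only when its (colA, colB) pair is missing from saved, regardless of colC.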
jezrael