Subtracting rows of dataframe A from dataframe B python pandas

Question

I have two dataframes, let's call them A and B. They have exactly the same 7 columns (let's call them col1, col2, col3, col4, col5, col6 and col7). Some of the columns include client_id, client_first_name, client_last_name, telephone number etc. (I can't reveal the exact names for confidentiality purposes).

DataFrame A is much bigger than DataFrame B and some of the entries from DataFrame B are included in DataFrame A (i.e. DataFrame B is a subset of DataFrame A).

The problem is, I want to keep only the records in DataFrame A that are NOT in DataFrame B, i.e. 'subtract' DataFrame B from DataFrame A. How do I do it?

So far, I've been adding an extra column called 'group' to both DataFrames, merging them using pd.merge(A, B, how='left', on='col'), and then pulling out the rows that ended up with different values for 'group_x' and 'group_y' (the merge created these two columns).
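For reference, a minimal sketch of a merge-based approach like this one. It is a variant rather than the exact code from the question: instead of a hand-made 'group' column it uses the indicator=True option of pd.merge, which adds a '_merge' column that says where each row came from. The sample data and the column names client_id and client_first_name are placeholders.

```python
import pandas as pd

# Hypothetical sample data; the real column names are not known.
A = pd.DataFrame({
    "client_id": [1, 2, 3, 4],
    "client_first_name": ["Ann", "Bob", "Cy", "Di"],
})
B = pd.DataFrame({
    "client_id": [2, 4],
    "client_first_name": ["Bob", "Di"],
})

# Keep only the key column of B and drop duplicate keys, so the left
# merge cannot multiply rows of A.
keys_in_B = B[["client_id"]].drop_duplicates()

# indicator=True adds a "_merge" column with the values
# "left_only", "right_only" or "both".
merged = pd.merge(A, keys_in_B, how="left", on="client_id", indicator=True)

# Rows marked "left_only" exist in A but have no match in B.
only_in_A = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(only_in_A)
```

To compare on all seven columns instead of a single key, the same idea should work with on set to the full list of column names and B passed whole (after drop_duplicates()).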

Is there an easier way to do it? I tried a bunch of things but none of them worked.

Check this answer: http://stackoverflow.com/a/28902170/2027457 · n1tk, Dec 01 '16 at 00:34

score 0 · Answer 1 · answered Dec 01 '16 at 00:32

Yes, your way is OK. You could also do something like dfA.loc[~dfA.col.isin(dfB.col)] if you don't need the merged dataframe. (Boolean negation in pandas is ~, not !, and .loc replaces the older .ix indexer.)
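A minimal runnable sketch of the isin approach, assuming a single key column identifies a record. The frames dfA and dfB and the columns col and name are made-up placeholders.

```python
import pandas as pd

# Hypothetical sample data standing in for the real frames.
dfA = pd.DataFrame({"col": [1, 2, 3, 4], "name": ["Ann", "Bob", "Cy", "Di"]})
dfB = pd.DataFrame({"col": [2, 4], "name": ["Bob", "Di"]})

# True for every row of dfA whose key also appears in dfB.
in_B = dfA["col"].isin(dfB["col"])

# ~ inverts the boolean mask, leaving the rows of dfA that are not in dfB.
result = dfA.loc[~in_B]
print(result)
```

This only compares one column. If a record is identified by several columns together, a merge on those columns is probably the safer choice.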

maxymoo
