1

I have two dataframe d1 and d2 as follows:
d1

A   B   C    D
X   6   7   23
Y   3   4   35
B   5   8   45
D   11  7   15
L   10  1   5

d2

A   C 
Y   4 
L   1 
D   7 

and want new column in d1 E as follow:

A   B   C   D   E
X   6   7   23  0
Y   3   4   35  1
B   5   8   45  0
D  11   7   15  1
L  10   1   5   1

new column E is added based on the condition of data in column A & C of d2 matched with data in column A & C of d1, number of rows are different in booth dataframe.

cs95
  • 379,657
  • 97
  • 704
  • 746
  • 2
    https://stackoverflow.com/questions/48647534/python-pandas-find-difference-between-two-data-frames/48647840#48647840 almost same:-) – BENY Jun 28 '19 at 18:16

1 Answers1

5

I'll be using merge (for a LEFT OUTER JOIN) with an indicator:

d1['E'] = (d1.merge(d2, on=['A', 'C'], how='left', indicator=True)
             .eval('_merge == "both"')
             .astype(int))
d1
   A   B  C   D  E
0  X   6  7  23  0
1  Y   3  4  35  1
2  B   5  8  45  0
3  D  11  7  15  1
4  L  10  1   5  1

Here's an inefficient approach using tuples conversion with isin checking:

d1['E'] = (
    d1[['A', 'C']].apply(tuple, axis=1).isin(d2.apply(tuple, axis=1)).astype(int))
d1
   A   B  C   D  E
0  X   6  7  23  0
1  Y   3  4  35  1
2  B   5  8  45  0
3  D  11  7  15  1
4  L  10  1   5  1

I'd rather post this myself, that way I get to tell you I don't recommend its use.

cs95
  • 379,657
  • 97
  • 704
  • 746