0

I've been using pandas for some months and today I found something weird.

Let's say I have these two dataframes :

df1 = pd.DataFrame(data={'C1' : [1,1,1,2],'C2' : ['A','B','C','D']})
df2 = pd.DataFrame(data={'C1':[2,2,2],'C2':['A','B','C']})

What I want is : from df2, every pairs of {C1,C2} that exist in df1.

This is what I wrote : df2[df2.C1.isin(df1.C1) & df2.C2.isin(df1.C2)]

The result I would like to have is an empty dataFrame because in df1, 2 is not linked with 'A','B' or 'C' and what I get is df2. I tried df2[df2[["C1,"C2"]].isin(df1[["C1,"C2"]])] but it does not work if df2 has more columns (even if unused).

Maxime D.
  • 306
  • 5
  • 17
  • note that isin gives u booleans. as such you are working with booleans and not the actual value, as such each condition gives u True for all rows, and when you index, u definitely will get back the same dataframe. zipa's solution offers the right way to get what u want. just felt i should explain why u had the outcome – sammywemmy Mar 13 '20 at 09:10

1 Answers1

2

You can do it with inner merge:

df2.merge(df1, how='inner', on=['C1', 'C2'])

Empty DataFrame
Columns: [C1, C2]
Index: []
zipa
  • 27,316
  • 6
  • 40
  • 58
  • Thank you for your answer, it works well. Also, I would like to know why two 'isin' in one condition does not work ? – Maxime D. Mar 13 '20 at 09:12
  • @Mortimer6969 Your `isin` returns two Series each having all `True` because each element of `df2.C1` is in fact present in `df1.C1`. The same goes for `df2.C2`. What you end up with is two arrays, both having all `True`, that are compared using `&` operator. Needless to say that `True & True` evaluate into `True`. – zipa Mar 13 '20 at 09:20