I have multiple pandas dataframes, to keep it simple, let's say I have three.
>> df1=
col1 col2
id1 A B
id2 C D
id3 B A
id4 E F
>> df2=
col1 col2
id1 B A
id2 D C
id3 M N
id4 F E
>> df3=
col1 col2
id1 A B
id2 D C
id3 N M
id4 E F
The result needed is :
>> df=
col1 col2
id1 A B
id2 C D
id3 E F
Because the pairs (A, B),(C, D),(E, F) appear in all the data frames although it may be reversed.
While using pandas merge it just considers the way columns are passed. To check my observation I tried the following code for two data frames:
df1['reverse_1'] = (df1.col1+df1.col2).isin(df2.col1 + df2.col2)
df1['reverse_2'] = (df1.col1+df1.col2).isin(df2.col2 + df2.col1)
And I found that the results differ:
col1 col2 reverse_1 reverse_2
a b False True
c d False True
b a True False
e f False True
So, if I collect 'True' values from both reverse_1 and reverse_2 columns, I can get the intersect of both the data frames. Even if I do it for two data frames it's not clear to me how to proceed with more data frames (more than two). I am little confused about that. Any suggestions?