Pandas find duplicates with reversed values between columns

Question

What is the fastest way to find duplicates where value from Column A have been reversed with value from Column B?

For example, if I have a DataFrame with :

     Column A           Column B
0           C                  P
1           D                  C
2           L                  G
3           A                  D
4           B                  P
5           B                  G
6           P                  B
7           J                  T
8           P                  C
9           J                  T

The result will be :

     Column A           Column B
0           C                  P
8           P                  C
4           B                  P
6           P                  B

I tried :

df1 = df
df2 = df
for i in df2.index:
     res = df1[(df1['Column A'] == df2['Column A'][i]) & (df1['Column B'] == df2['Column B'][i])]

But this is very slow and it iterates over the same values...

What have you tried so far? – DirtyBit Jan 22 '19 at 14:00 — DirtyBit, Jan 22 '19 at 14:00

score 5 · Accepted Answer · answered Jan 22 '19 at 14:12

5

Use merge with renamed columns DataFrame:

d = {'Column A':'Column B','Column B':'Column A'}
df = df.merge(df.rename(columns=d))
print (df)
  Column A Column B
0        C        P
1        B        P
2        P        B
3        P        C

answered Jan 22 '19 at 14:12

jezrael

822,522
95
1,334
1,252

score 0 · Answer 2 · answered Jan 22 '19 at 14:06

0

You could try using reindex for inversion.

column_Names=["B","A"]
df=df.reindex(columns=column_Names)

Or you could just do this:

col_list = list(df)   # get a list of the columns
col_list[0], col_list[1] = col_list[1], col_list[0]
df.columns = col_list   # assign back

answered Jan 22 '19 at 14:06

Sumanth

497
4
12

Thanks but this is not really what I'm after. This will indeed reverse the columns but I would like to know if there is duplicates in my df where the values between two columns have been reversed. – ldoe Jan 22 '19 at 14:14
Thanks to you and @jezrael that lead me on the way to the solution. Just reverse the columns and merge ! Thanks. – ldoe Jan 22 '19 at 14:17

Pandas find duplicates with reversed values between columns

2 Answers2