I have a dataframe that looks like this:

df = pd.DataFrame({'A':['yes','yes','yes','yes','no','no','yes','yes','yes','no'],
                   'B':['yes','no','no','no','yes','yes','no','yes','yes','no']})

df
----------------------------
index         A        B
0           yes      yes
1           yes       no
2           yes       no
3           yes       no
4            no      yes
5            no      yes
6           yes       no
7           yes      yes
8           yes      yes
9            no       no
-----------------------------

The ideal output would be like:

----------------------------
          A       B       
----------------------------
0         no       no           
1        yes       no       
2        yes      yes      
----------------------------

Instead of the four ordered combinations of yes and no, I want only three unordered ones: (yes, no) and (no, yes) should count as the same pair. The frequency of each pair doesn't matter.

I've tried groupby, but it treats (yes, no) and (no, yes) as different and returns four pairs. I've also tried pd.unique. This is very similar to another SO post, though not quite the same problem, and I borrowed the example from there. Thanks, y'all!

timxymo1225

1 Answer

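Use np.sort to sort the values within each row, so that (yes, no) and (no, yes) become the same row, then drop the duplicates:

pd.DataFrame(np.sort(df, axis=1), columns=df.columns).drop_duplicates()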
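
For anyone who wants to run this end to end, here is a minimal sketch that assumes the df from the question. The column reversal, sort_values and reset_index steps are my additions, not part of the one-liner. They are only there to reproduce the layout of the desired output (mixed pair shown as yes/no, rows numbered 0 to 2). The names sorted_rows and unique_pairs are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['yes', 'yes', 'yes', 'yes', 'no', 'no', 'yes', 'yes', 'yes', 'no'],
    'B': ['yes', 'no', 'no', 'no', 'yes', 'yes', 'no', 'yes', 'yes', 'no'],
})

# Sort the values inside each row so that ('yes', 'no') and ('no', 'yes')
# both become ('no', 'yes') and can be recognised as duplicates.
sorted_rows = np.sort(df.to_numpy(), axis=1)

unique_pairs = (
    # Reverse the column order so the mixed pair reads ('yes', 'no'),
    # matching the layout in the question (optional, cosmetic only).
    pd.DataFrame(sorted_rows[:, ::-1], columns=df.columns)
    .drop_duplicates()
    .sort_values(list(df.columns))
    .reset_index(drop=True)
)

print(unique_pairs)
```

If the order within a pair and the original index labels don't matter to you, the one-liner alone should be enough. Note that it keeps the index of the first occurrence of each pair rather than renumbering the rows.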
ansev