I'd like to remove duplicates from a data frame (df) when two columns hold the same values, even if those values are in reverse order. My actual data set has 7046 rows.

This is sample data:

> df
part_no.   alt_part_no
    1           2
    1           3
    2           1
    2           3
    3           1
    3           2
    4           5
    5           4
    6           7
    6           8
    6           9
    7           6
    7           8  
    7           9
    8           6
    8           7
    8           9
    9           6
    9           7 
    9           8

I want to generate a new data frame without duplicates. For example, rows 1 and 2 carry the same information as rows 3, 4, 5 and 6. Since they hold the same information, I would like a final file without duplicates, like the one below:

>output
part_no.   alt_part_no
1           2
1           3
4           5
6           7
6           8
6           9

Can someone help? The unique command won't work for this, and I don't know how to do it.

I tried:

df[!duplicated(t(apply(df, 1, sort))),]

>output

     part_no.    alt_part_no
         1           2
         1           3
         2           3
         4           5
         6           7
         6           8
         6           9
         7           8
         7           9
         8           9
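
The attempt above removes only exact reversed pairs, so a row such as 2 3 survives because only 3 2 is its mirror. The desired output looks like it follows a stricter rule: parts that are linked to each other form one group, and only the rows whose part_no. is the smallest number in its group are kept. That rule is an assumption read off the sample data, not something stated in the question. A minimal base R sketch under that assumption (the names parts, label and result are made up for illustration):

```r
# Sample data from the question.
df <- data.frame(
  part_no.    = c(1, 1, 2, 2, 3, 3, 4, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9),
  alt_part_no = c(2, 3, 1, 3, 1, 2, 5, 4, 7, 8, 9, 6, 8, 9, 6, 7, 9, 6, 7, 8)
)

# Assumption: linked parts belong to one group. Label every part with the
# smallest part number in its group by repeatedly passing the lower label
# along each row until nothing changes.
parts <- sort(unique(c(df$part_no., df$alt_part_no)))
label <- setNames(parts, parts)

repeat {
  # Lowest label seen on either side of each row.
  low <- pmin(label[as.character(df$part_no.)],
              label[as.character(df$alt_part_no)])

  # Per part, the lowest label over all rows that mention it.
  m1 <- tapply(low, df$part_no., min)
  m2 <- tapply(low, df$alt_part_no, min)

  new_label <- label
  new_label[names(m1)] <- pmin(new_label[names(m1)], as.vector(m1))
  new_label[names(m2)] <- pmin(new_label[names(m2)], as.vector(m2))

  if (all(new_label == label)) break
  label <- new_label
}

# Keep only rows whose part_no. is the smallest number in its group.
result <- df[df$part_no. == label[as.character(df$part_no.)], ]
print(result, row.names = FALSE)
```

On the sample data this keeps the six rows listed in the desired output. If the real 7046-row data contains groups that are linked only through chains of intermediate parts, the loop simply needs more passes; whether this grouping rule is what is actually wanted there should be checked against a few known cases.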
B.Dumindu