Is there a function that treats the elements of a row as a set and returns only the first occurrence of each unique set?

In the example below, rows 1 and 3 should be considered equal. It should not matter to the function foo whether an element is in col1 or col2.

df <- data.frame(col1 = c('a', 'b', '1'), col2 = c('1', '2', 'a'))


foo(df)

>   col1 col2
> 1    a    1
> 2    b    2
D Pinto

1 Answer

You could do something like this:

df[!duplicated(t(apply(df,1,sort))),]

  col1 col2
1    a    1
2    b    2

It sorts the values within each row (so that a-1 and 1-a end up identical), and then keeps only those rows of df whose sorted version has not already appeared.
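If you want this wrapped up as the foo from the question, here is a minimal sketch of one way to do it. It is a variant of the one-liner rather than the same code: instead of a transposed matrix, it builds one string key per row by pasting the sorted values together. The separator "|" is an assumption (it must not occur in your data), and values that repeat within a row are not collapsed, so this is order-insensitive rather than a strict set comparison.

```r
# Sketch of foo(): keep the first row for each unordered combination of values.
foo <- function(df, sep = "|") {
  # as.matrix() coerces all columns to one common type (character here).
  m <- as.matrix(df)
  # One key per row: sort the row's values, then paste them into one string.
  # na.last = TRUE keeps NA values in the key instead of dropping them.
  # Assumption: `sep` never appears inside the data itself.
  keys <- apply(m, 1, function(r) paste(sort(r, na.last = TRUE), collapse = sep))
  # duplicated() flags every repeat of a key after its first occurrence.
  df[!duplicated(keys), , drop = FALSE]
}

df <- data.frame(col1 = c('a', 'b', '1'), col2 = c('1', '2', 'a'))
result <- foo(df)
print(result)
```

Using a single key per row means the function should also behave sensibly for a one-column data frame, where t(apply(...)) would no longer give one row per observation.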

Andrew Gustar