Identifying unique pairs of values from two columns in a dataframe

Question

I have a data frame like the following:

myDf <- data.frame(Var1 = c("dennis", "marcus", "bat" ,"man", "mennis", "cool"), 
                   Var2 = c("mennis", "cool", "man", "bat", "dennis", "marcus"))

> myDf
    Var1    Var2
1 dennis  mennis
2 marcus    cool
3    bat     man
4    man     bat
5 mennis   dennis
6   cool   marcus

What I would like to obtain is the set of unique pairs across both variables, treating a pair and its reverse (e.g. bat/man and man/bat) as the same pair, as follows:

    Var1    Var2
1 dennis  mennis
2 marcus    cool
3    bat     man

score 4 · Accepted Answer · answered Sep 19 '15 at 14:54

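We sort each row using apply with MARGIN=1 and transpose the result so that each sorted pair is a row again. Reversed pairs then become identical rows, so duplicated gives a logical index of the repeats, and we subset the original dataset with the negation of that index.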

 myDf[!duplicated(t(apply(myDf, 1, sort))),]
 #    Var1   Var2
 #1 dennis mennis
 #2 marcus   cool
 #3    bat    man
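
For readers who want to see the intermediate steps, here is a minimal sketch that unpacks the one-liner and compares it with a possible alternative based on pmin/pmax. The names sortedRows, isRepeat, first, second, resApply and resPminPmax are illustrative and not part of the original answer, and the alternative assumes exactly two character columns (hence stringsAsFactors = FALSE, which the original data frame does not set).

```r
# Same data as in the question. stringsAsFactors = FALSE is an assumption added
# here so the columns are character vectors (pmin/pmax do not work on factors).
myDf <- data.frame(
  Var1 = c("dennis", "marcus", "bat", "man", "mennis", "cool"),
  Var2 = c("mennis", "cool", "man", "bat", "dennis", "marcus"),
  stringsAsFactors = FALSE
)

# Step-by-step version of the one-liner from the answer.
sortedRows <- t(apply(myDf, 1, sort))  # 6 x 2 matrix, each row sorted
isRepeat   <- duplicated(sortedRows)   # TRUE where the sorted pair appeared earlier
resApply   <- myDf[!isRepeat, ]

# Possible alternative for the two-column case: build the sorted pair
# with element-wise pmin/pmax instead of a row-wise apply.
first  <- pmin(myDf$Var1, myDf$Var2)
second <- pmax(myDf$Var1, myDf$Var2)
resPminPmax <- myDf[!duplicated(data.frame(first, second)), ]
```

Both results should keep the first occurrence of each pair, in the original column order. The pmin/pmax variant avoids the conversion to a matrix, which may matter on larger data, but it does not generalise beyond two columns the way the apply version does.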

akrun

thanks so much, don't know why this never came to my mind for almost half a day... Kudos to you, and 'boos' to me! :) – info_seekeR Sep 19 '15 at 14:58

@info_seekeR It may be a bit tricky to think about it. I guess it is a legitimate question, but it may have been asked before. – akrun Sep 19 '15 at 15:00
