R - make a unique list of two columns interchangeably

Question

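I have a data frame that I need to deduplicate on two columns, treating the columns as interchangeable. That is, a row should count as a duplicate if the same pair of values already appears in either order: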

dataframe:

df <- data.frame(col1=c("a",1,"bar","foo"),col2=c(1,"a","foo","bar"))

My goal is to keep only one of the two rows that contain the same data. For example, keeping either foo-bar or bar-foo would suffice.

an output can be:

score 4 · Accepted Answer · answered Feb 24 '20 at 20:22


Here is a base R way.

inx <- !duplicated(t(apply(df, 1, sort)))
df[inx, ]

One-liner:

df[!duplicated(t(apply(df, 1, sort))), ]
#  col1 col2
#1    a    1
#3  bar  foo
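As a possible alternative to the row-wise `apply`, the same order-insensitive key can be built with vectorized `pmin`/`pmax`. This is only a sketch, assuming both columns are character vectors (hence the explicit `stringsAsFactors = FALSE`); the names `lo`, `hi`, `keep` and `res` are illustrative and not part of the original answer.

```r
# Example data from the question, with columns kept as character
df <- data.frame(col1 = c("a", 1, "bar", "foo"),
                 col2 = c(1, "a", "foo", "bar"),
                 stringsAsFactors = FALSE)

# Element-wise smaller and larger value of each pair, so that
# (foo, bar) and (bar, foo) produce the same (lo, hi) key
lo <- pmin(df$col1, df$col2)
hi <- pmax(df$col1, df$col2)

# Keep the first row seen for each distinct key
keep <- !duplicated(data.frame(lo, hi))
res <- df[keep, ]
res
```

Like the `duplicated` approach above, this keeps the original rows untouched and only drops the later occurrence of each pair.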

answered Feb 24 '20 at 20:22

Rui Barradas


score 1 · Answer 2 · answered Feb 24 '20 at 20:24


Based on "Sorting each row of a data frame", you can do:

unique(t(apply(df, 1, sort)))

answered Feb 24 '20 at 20:24

Simon Woodward



Convert that back to a data frame if needed. – G. Grothendieck Feb 24 '20 at 20:26
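A minimal sketch of that conversion, assuming the original column names should be reused (the names `m` and `res` are illustrative). Note that because each row is sorted first, the values within a row may end up in a different order than in the original data frame.

```r
# Example data from the question
df <- data.frame(col1 = c("a", 1, "bar", "foo"),
                 col2 = c(1, "a", "foo", "bar"),
                 stringsAsFactors = FALSE)

# unique() on the sorted rows returns a character matrix
m <- unique(t(apply(df, 1, sort)))

# Convert the matrix back to a data frame and restore the column names
res <- as.data.frame(m, stringsAsFactors = FALSE)
names(res) <- names(df)
res
```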

score 0 · Answer 3 · answered Feb 24 '20 at 20:20


Under the assumption your data is paired all the way through, a for loop can solve this for you.

## Start the result with the first row
df2 <- df[1, ]
## Loop through the remaining rows
for (i in seq_len(nrow(df))[-1]) {
    ## Skip the row if its col2 value already appears in the kept col1 values
    if (df[i, "col2"] %in% df2[, "col1"]) next
    df2 <- rbind(df2, df[i, ])
}

answered Feb 24 '20 at 20:20

Badger



This will take a long time if the df is large. I have already found some solutions using `dplyr` and `data.table`; I was trying to find the easiest way. – Ibo Feb 24 '20 at 20:22
