I am trying to remove a row if its swapped version also exists in the data frame.

For example, if I have a data frame:

1 2
1 3
1 4
2 4
4 2
2 1

Here the rows (1,2) and (2,4) should be removed, because (2,1) and (4,2) are also in the data frame. Is there a fast and neat way to do this? Thank you!

DigiPath

2 Answers

You can sort the two columns row-wise and then keep only the unique rows:

library(dplyr)

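df %>%
  # col1 holds the smaller value of each row, col2 the larger
  mutate(col1 = pmin(V1, V2),
         col2 = pmax(V1, V2)) %>%
  # swapped pairs now share the same (col1, col2), so one of each is kept
  distinct(col1, col2)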

#  col1 col2
#1    1    2
#2    1    3
#3    1    4
#4    2    4

Using base R:

# add the row-wise min and max as columns 3 and 4, then drop rows that repeat them
df1 <- transform(df, col1 = pmin(V1, V2), col2 = pmax(V1, V2))
df[!duplicated(df1[3:4]), ]

Data:

df <- structure(list(V1 = c(1L, 1L, 1L, 2L, 4L, 2L), V2 = c(2L, 3L, 
4L, 4L, 2L, 1L)), class = "data.frame", row.names = c(NA, -6L))
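
Both versions above keep the first row of each swapped pair, whereas the question describes dropping (1,2) and (2,4) and keeping their later swaps. A minimal sketch of that variant, assuming the same data and using the `fromLast` argument of `duplicated` (the names `key`, `keep_first` and `keep_last` are only illustrative):

```r
# Same example data as above
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order-independent key: smaller value first, larger value second
key <- data.frame(lo = pmin(df$V1, df$V2), hi = pmax(df$V1, df$V2))

# Keep the first row of each swapped pair (what the answer above does)
keep_first <- df[!duplicated(key), ]

# Keep the last row of each swapped pair instead
keep_last <- df[!duplicated(key, fromLast = TRUE), ]
```

With this data, `keep_first` holds original rows 1 to 4 and `keep_last` holds rows 2, 3, 5 and 6, that is (1,3), (1,4), (4,2) and (2,1).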
Ronak Shah

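Another base R solution uses rowSums and duplicated. Note that it treats any two rows with the same sum as duplicates, so it is only safe when equal sums can only come from swapped pairs, as in this example: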

df[!duplicated(rowSums(df)),]
#   V1 V2
# 1  1  2
# 2  1  3
# 3  1  4
# 4  2  4
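
The row sum is not a unique key for an unordered pair: (1,4) and (2,3) both sum to 5 without being swaps of each other. A small sketch, using a hypothetical data frame `df2` that is not part of the question, compares the sum-based filter with a min/max key:

```r
# Hypothetical data: (1,4) and (2,3) are different pairs with the same sum,
# while (3,2) is a genuine swap of (2,3)
df2 <- data.frame(V1 = c(1L, 2L, 3L), V2 = c(4L, 3L, 2L))

# Sum-based filter: every row sums to 5, so only the first row survives
by_sum <- df2[!duplicated(rowSums(df2)), ]

# Min/max key: only the swapped row (3,2) is dropped
key <- data.frame(lo = pmin(df2$V1, df2$V2), hi = pmax(df2$V1, df2$V2))
by_pair <- df2[!duplicated(key), ]
```

Here `by_sum` has one row and `by_pair` has two, so the min/max key is likely the safer choice when sums can collide.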
Chris Ruehlemann