Remove rows if the swap also exists in the data frame in R

Question

I am trying to remove rows whose swap (the same two values in reverse order) also exists in the data frame.

For example, if I have a data frame:

  V1 V2
1  1  2
2  1  3
3  1  4
4  2  4
5  4  2
6  2  1

Then the rows (1,2) and (2,4) should be removed, because their swaps (2,1) and (4,2) are also in the df. Is there a fast and neat way to do this? Thank you!

Can the same row be repeated twice? For example, (1, 2) and (1, 2)? – Ronak Shah, Jul 18 '20 at 03:46
No, if (1, 2) is in the list then (2, 1) cannot be in the list. – DigiPath, Jul 18 '20 at 03:48

score 1 · Accepted Answer · answered Jul 18 '20 at 03:49

You can sort the two values within each row and then keep only the unique pairs:

library(dplyr)

df %>%
 mutate(col1 = pmin(V1, V2), 
        col2 = pmax(V1, V2)) %>%
 distinct(col1, col2)

#  col1 col2
#1    1    2
#2    1    3
#3    1    4
#4    2    4

Using base R:

df1 <- transform(df, col1 = pmin(V1, V2), col2 = pmax(V1, V2))
df[!duplicated(df1[3:4]), ]

data

df <- structure(list(V1 = c(1L, 1L, 1L, 2L, 4L, 2L), V2 = c(2L, 3L, 
4L, 4L, 2L, 1L)), class = "data.frame", row.names = c(NA, -6L))
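To make the base R approach easy to run end to end, here is a minimal self-contained sketch using the sample data above. The names `lo`, `hi`, `is_swap` and `result` are illustrative and not part of the original answer.

```r
# Sample data, same values as in the "data" section
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order each pair so the smaller value comes first;
# (4, 2) and (2, 4) then both become (2, 4)
lo <- pmin(df$V1, df$V2)
hi <- pmax(df$V1, df$V2)

# duplicated() on a data frame compares whole rows, so a row is flagged
# when the same ordered pair already appeared earlier
is_swap <- duplicated(data.frame(lo, hi))

# Keep the first occurrence of every unordered pair, with the original columns
result <- df[!is_swap, ]
print(result)
```

Because the helper columns are only used to build the logical index, the result keeps the original `V1` and `V2` values of the first row seen for each pair.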

Ronak Shah

I like the dplyr solution, it is pretty fast! Thank you! – DigiPath Jul 18 '20 at 03:53
No, wait. Do you want to remove both the original and the swap, meaning both (1, 2) and (2, 1)? My answer only removes (2, 1). – Ronak Shah Jul 18 '20 at 03:54
In case you want to remove both, you can use `df[!(duplicated(df1[3:4]) | duplicated(df1[3:4], fromLast = TRUE)), ]` – Ronak Shah Jul 18 '20 at 04:05
No, I mean keep either (1,2) or (2,1), so I think your answer meets my need. Thank you! – DigiPath Jul 18 '20 at 04:29
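The difference between the two readings discussed in these comments (keep one row of each swapped pair, or drop both) can be sketched side by side. This is a small illustration on the sample data; `key`, `keep_one` and `drop_both` are names chosen here, not taken from the thread.

```r
# Sample data, same values as in the answer
df <- data.frame(V1 = c(1L, 1L, 1L, 2L, 4L, 2L),
                 V2 = c(2L, 3L, 4L, 4L, 2L, 1L))

# Order-independent key: smaller value first, larger value second
key <- data.frame(lo = pmin(df$V1, df$V2),
                  hi = pmax(df$V1, df$V2))

# Reading 1: keep the first row of each pair, drop later swaps
keep_one <- df[!duplicated(key), ]

# Reading 2: drop every row that has a swap anywhere in the data.
# duplicated() flags later occurrences; fromLast = TRUE flags earlier ones,
# so combining them with | flags all members of a duplicated pair.
has_partner <- duplicated(key) | duplicated(key, fromLast = TRUE)
drop_both <- df[!has_partner, ]

print(keep_one)
print(drop_both)
```

On this data `keep_one` has four rows, while `drop_both` keeps only (1, 3) and (1, 4), the two pairs that never appear swapped.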

score 0 · Answer 2 · answered Jul 18 '20 at 07:12

Another base R solution uses rowSums and duplicated, treating two rows as swaps of each other when their row sums are equal:

df[!duplicated(rowSums(df)), ]

#  V1 V2
#1  1  2
#2  1  3
#3  1  4
#4  2  4
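One caveat worth checking before using the row-sum shortcut on other data: it gives the expected result here because every unordered pair in the sample happens to have a distinct sum. The sketch below uses a hypothetical data frame `df2` (not from the question) in which (1, 4) and (2, 3) share the sum 5, and compares it with an order-independent key built from pmin and pmax.

```r
# Hypothetical data: (1, 4) and (2, 3) are different pairs with the same sum,
# and (4, 1) is a genuine swap of (1, 4)
df2 <- data.frame(V1 = c(1L, 2L, 4L),
                  V2 = c(4L, 3L, 1L))

# Row-sum approach: all three rows sum to 5, so only the first row survives
by_sum <- df2[!duplicated(rowSums(df2)), ]

# Pair-based approach: compare the ordered (min, max) pair instead of the sum
key <- data.frame(lo = pmin(df2$V1, df2$V2),
                  hi = pmax(df2$V1, df2$V2))
by_pair <- df2[!duplicated(key), ]

print(by_sum)   # only (1, 4); (2, 3) is lost although it is not a swap
print(by_pair)  # (1, 4) and (2, 3); only the swap (4, 1) is removed
```

If pairs with equal sums can occur in the real data, the pair-based key is the safer choice.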

Chris Ruehlemann
