exchange two columns and remove the duplicates in a data frame using R

Question

Here is an example to explain what I want to do. I have a data frame like:

I want to change it to another format:

X1    Y1    X2    Y2
1     1     1     1
1     2     2     1
1     3     3     1
......

Consider two rows in the first table, say X=1, Y=2 and X=2, Y=1: they simply swap each other's values. I want to put such rows into one row, as shown in the second table, and then remove the duplicates, so that the 'thin and long' table becomes 'short and fat'. I know how to do it using two for loops, but in R that takes forever. Can anyone suggest a faster way?

Here is a minimal example:

The original table is:

X    Y
1    2
2    1

The transformed table that I want looks like:

X1    Y1     X2    Y2
1     2      2     1

So, rows in the first table that merely exchange values are merged into one row in the second table, and the extra row from the first table is removed.

This seems to be coming up regularly these days - see https://stackoverflow.com/questions/46943753/r-data-table-duplicate-rows-with-a-pair-of-columns?noredirect=1&lq=1 and https://stackoverflow.com/questions/25145982/extract-unique-rows-from-a-data-table-with-each-row-unsorted?noredirect=1&lq=1 — thelatemail, Dec 04 '17 at 22:28
And https://stackoverflow.com/questions/25297812/pair-wise-duplicate-removal-from-dataframe/25298863 — thelatemail, Dec 04 '17 at 22:36
I'm a little unclear on what you want as a result. Does `df[!duplicated(cbind(do.call(pmax,df), do.call(pmin,df))),]` give you your intended result? — thelatemail, Dec 04 '17 at 22:45
Thanks. I think this is not what I need, but the examples you provided are very helpful. I am reading them. At least I now know how to remove the 'duplicates', and then the rest is easy to solve. — Feng Chen, Dec 04 '17 at 23:07
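
The idea behind the `pmin`/`pmax` suggestion in the comments can be sketched on the minimal example from the question. This is only an illustration, not a confirmed solution: the names `lo`, `hi`, `keep`, `r` and `wide` are made up here, and building the wide table from the kept rows is an assumption about the intended output.

```r
# Minimal example from the question: (1, 2) and (2, 1) are the same unordered pair
df <- data.frame(X = c(1, 2), Y = c(2, 1))

# Order-independent key: smaller value first, larger value second
lo <- pmin(df$X, df$Y)
hi <- pmax(df$X, df$Y)

# Keep only the first row seen for each unordered pair
keep <- !duplicated(cbind(lo, hi))
r <- df[keep, ]

# Append the swapped columns to get the wide layout (assumed target format)
wide <- data.frame(X1 = r$X, Y1 = r$Y, X2 = r$Y, Y2 = r$X)
print(wide)
```

Because the key is built with vectorised functions, no explicit loop over rows is needed.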

score 0 · Answer 1 · answered Dec 27 '19 at 20:57

Maybe the base R code below will work:

# Keep one orientation of each pair (Y >= X), then append the swapped columns
r <- subset(df, Y >= X)
dfout <- setNames(cbind(r, rev(r)), c("X1", "Y1", "X2", "Y2"))

such that

> dfout
  X1 Y1 X2 Y2
1  1  1  1  1
2  1  2  2  1
3  1  3  3  1
5  2  2  2  2
6  2  3  3  2
9  3  3  3  3

DATA

df <- structure(list(X = c(1, 1, 1, 2, 2, 2, 3, 3, 3), Y = c(1, 2, 
3, 1, 2, 3, 1, 2, 3)), class = "data.frame", row.names = c(NA, 
-9L))
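
One caveat about filtering on `Y >= X`: a row whose mirrored partner is missing from the table is dropped whenever its `Y` is smaller than its `X`. Whether such rows should be kept is not stated in the question, so the following is only a sketch under that assumption. `df2`, `by_filter`, `key` and `by_key` are hypothetical names, and the data are invented for illustration.

```r
# Hypothetical input: (3, 1) has no mirrored row (1, 3)
df2 <- data.frame(X = c(1, 2, 3), Y = c(2, 1, 1))

# Filtering on Y >= X keeps only (1, 2); (3, 1) is dropped as well
by_filter <- subset(df2, Y >= X)

# Deduplicating on the sorted pair keeps one row per unordered pair
key <- cbind(pmin(df2$X, df2$Y), pmax(df2$X, df2$Y))
by_key <- df2[!duplicated(key), ]

print(by_filter)
print(by_key)
```

For fully symmetric data such as the 3 x 3 example above, both approaches keep the same number of rows.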

score -1 · Answer 2 · answered Dec 04 '17 at 22:29

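library(tidyverse)
# Each column needs its values wrapped in c()
df <- tibble(x1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
             y1 = c(1, 2, 3, 1, 2, 3, 1, 2, 3))
# Keep one orientation of each pair, then add the swapped columns;
# distinct() alone only drops rows that are identical in every column
df <- df %>% filter(x1 <= y1) %>% mutate(x2 = y1, y2 = x1) %>% distinct()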

I think this does the trick.

jandraor
