I'm not sure if I phrased my question properly, so let me give a simplified example:
Given a dataset as follows:
library(dplyr)
dat <- tibble(X = c("A", "B", "B", "C", "A"),
              Y = c("B", "A", "C", "A", "C"))
how can I compute a pair variable that represents whatever was in X and Y at a given row, but without generating duplicates, as here:
dat$pair <- c("A-B", "A-B", "B-C", "C-A", "C-A")
dat
# A tibble: 5 × 3
X Y pair
<chr> <chr> <chr>
1 A B A-B
2 B A A-B
3 B C B-C
4 C A C-A
5 A C C-A
I can compute a pairing with paste0, but it will introduce duplicates (C-A is the same as A-C for me) that I want to avoid:
> dat <- mutate(dat, pair = paste0(X, "-", Y))
> dat
# A tibble: 5 × 3
X Y pair
<chr> <chr> <chr>
1 A B A-B
2 B A B-A
3 B C B-C
4 C A C-A
5 A C A-C
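For reference, here is a minimal sketch of one order-independent way to build the label, assuming it does not matter which of the two orderings is kept. It puts the alphabetically smaller value first using base R's pmin() and pmax(), so rows 4 and 5 both come out as A-C rather than the C-A shown in the desired output above; the rows are still grouped the same way.

```r
library(dplyr)

dat <- tibble(X = c("A", "B", "B", "C", "A"),
              Y = c("B", "A", "C", "A", "C"))

# pmin()/pmax() work element-wise on character vectors, so each row's
# smaller value always comes first, whichever column it was in.
# Note: string ordering follows the session's collation locale.
dat <- mutate(dat, pair = paste0(pmin(X, Y), "-", pmax(X, Y)))

dat$pair
# [1] "A-B" "A-B" "B-C" "A-C" "A-C"
```

If the exact label matters (C-A instead of A-C), the two arguments could be swapped for that case, but that rule would need to be defined separately.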