0

I have data that look like this:

  X Y  TOTAL PAIR
1 A B     1   AB
2 B A     2   BA
3 C D     3   CD
4 D C     4   DC
5 E F     5   EF

I want to add a variable PAIR_ID that captures the same IDs in any combination. So the order doesn't matter. It should look like this:

  X Y  TOTAL PAIR  PAIR_ID
1 A B     1   AB       1
2 B A     2   BA       1
3 C D     3   CD       2
4 D C     4   DC       2
5 E F     5   EF       3

The goal is to get a df that has totals for each pair. So something like this:

    PAIR_ID   PAIR_TOTAL
1       1          3
2       2          7
3       3          5

My question is how to get that PAIR_ID variable. I'm stumped. Appreciate any help.

M. Clog
  • 13
  • 2

1 Answers1

0

One option is

library(dplyr)
out <- df1 %>% 
         mutate(PAIR_ID = group_indices(., pmax(X, Y), grp2 = pmin(X,Y)) )

out %>% 
    group_by(PAIR_ID) %>%
    summarise(PAIR_TOTAL = sum(TOTAL))
# A tibble: 3 x 2
#  PAIR_ID PAIR_TOTAL
#    <int>      <int>
#1       1          3
#2       2          7
#3       3          5

data

df1 <- structure(list(X = c("A", "B", "C", "D", "E"), Y = c("B", "A", 
"D", "C", "F"), TOTAL = 1:5, PAIR = c("AB", "BA", "CD", "DC", 
"EF")), class = "data.frame", row.names = c("1", "2", "3", "4", 
"5"))
akrun
  • 874,273
  • 37
  • 540
  • 662