
I have input data that looks like this, where each row is a pair of IDs:

library(data.table)

dat <- data.table(ID1 = c('A','B','C','X','B','X','F','E','G','F','A'),
                  ID2 = c('B','C','A','A','X','C','J','J','I','E','I'))

I would like a function that generates the highest-order sets from these pairs, i.e. the largest sets in which every member is paired with every other member in the original data. For the example data, I would expect the output to be:

list(c('A','B','C','X'),
 c('E','F','J'),
 c('G','I'),
 c('A','I'))

Any ideas?

Edit: To clarify how this differs from the related StackOverflow post https://stackoverflow.com/questions/12135971/identify-groups-of-linked-episodes-which-chain-together: I do not want groups where relationships are inferred by chaining pairs together, only groups where every relationship is supported by a pair in the data. In the example data, (A,I) is a different set from (A,B,C,X) because the pairs (B,I), (C,I), and (X,I) do not exist.

Henrik
  • Related: https://stackoverflow.com/questions/12135971/identify-groups-of-linked-episodes-which-chain-together – MrFlick Nov 21 '22 at 14:37
  • Here's another igraph solution that gets close: `igraph::cluster_edge_betweenness(igraph::graph_from_data_frame(dat, directed=FALSE))`. – MrFlick Nov 21 '22 at 14:41
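
The sets described in the question (every member paired with every other member, and no further ID can be added) appear to correspond to what graph theory calls maximal cliques, if each row of the data is treated as an undirected edge. Below is a minimal sketch under that assumption, using igraph's `max_cliques()`. It assumes the igraph and data.table packages are installed, and the names `g`, `cliques` and `sets` are only illustrative.

```r
library(data.table)
library(igraph)

# Example data from the question: each row is one pair of IDs.
dat <- data.table(ID1 = c('A','B','C','X','B','X','F','E','G','F','A'),
                  ID2 = c('B','C','A','A','X','C','J','J','I','E','I'))

# Treat each pair as an undirected edge between two named vertices.
g <- graph_from_data_frame(dat, directed = FALSE)

# max_cliques() returns every maximal clique as a vertex sequence.
# min = 2 keeps only cliques with at least two members.
cliques <- max_cliques(g, min = 2)

# Convert each vertex sequence to a sorted character vector of IDs.
sets <- lapply(cliques, function(v) sort(v$name))

print(sets)
```

For the example data this should give the four sets listed in the question, though the order in which igraph returns them is not guaranteed. Unlike community detection such as `cluster_edge_betweenness()`, this lets an ID like A appear in more than one set, which is what the (A,I) case requires.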
