4

I am working with a list of edges in R:

data <- structure(list(var1 = c("a", "b", "c", "d", "f", "g", "h"), var2 = c("b",       
"c", "a", "e", "g", "h", "i")), class = "data.frame", row.names = c(NA,         
-7L))  
> data
  var1 var2
1    a    b
2    b    c
3    c    a
4    d    e
5    f    g
6    g    h
7    h    i

I derived an igraph object from it:

library(igraph)

a <- graph_from_data_frame(data)

> a
IGRAPH 4cd4c06 DN-- 9 7 --                                                      
+ attr: name (v/c)                                                              
+ edges from 4cd4c06 (vertex names):                                            
[1] a->b b->c c->a d->e f->g g->h h->I 

and I have to test whether I have all the combinations between the vertices for every community in my data. I know every community should have nC2 edges, where n represents the number of nodes in the community, but I am not sure on how to do it with igraph.

In the example above, community 1 and 2 should be valid, as they have all the contribution between vertices, while community 3 shouldn't.

How do I test this?

As the desired output, ideally I would like to have something like this:

> data2
  var1 var2 valid
1    a    b  TRUE
2    b    c  TRUE
3    c    a  TRUE
4    d    e  TRUE
5    f    g FALSE
6    g    h FALSE
7    h    i FALSE

or anything that would allow me to identify the incomplete pairs.

Thanks!

Giulio Centorame
  • 678
  • 4
  • 19

2 Answers2

3

You can use membership like below

data %>%
  mutate(grp = membership(components(a))[var1]) %>%
  group_by(grp) %>%
  mutate(valid = choose(n_distinct(c(var1, var2)), 2) == n()) %>%
  ungroup()

which gives

# A tibble: 7 × 4
  var1  var2    grp valid
  <chr> <chr> <dbl> <lgl>
1 a     b         1 TRUE
2 b     c         1 TRUE
3 c     a         1 TRUE
4 d     e         2 TRUE
5 f     g         3 FALSE
6 g     h         3 FALSE
7 h     i         3 FALSE

where grp indicates how the vertices are clustered.

ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
2

The following does not output the expected format in the question but it does return a FALSE/TRUE vector by community.

data <- structure(list(
  var1 = c("a", "b", "c", "d", "f", "g", "h"), 
  var2 = c("b","c", "a", "e", "g", "h", "i")), 
  class = "data.frame", row.names = c(NA,-7L)) 

suppressPackageStartupMessages(
  library(igraph)
)

a <- graph_from_data_frame(data)

cmp <- components(a)
tapply(names(cmp$membership), cmp$membership, FUN = \(v) {
  cmb <- combn(v, 2, \(w) distances(as.undirected(a), v = w[1], to = w[2]))
  all(cmb == 1)
})
#>     1     2     3 
#>  TRUE  TRUE FALSE

Created on 2022-10-13 with reprex v2.0.2

Rui Barradas
  • 70,273
  • 8
  • 34
  • 66