7

I'm trying to calculate the clusters of a network using igraph in R, where all nodes are connected. The plot seems to work OK, but then I'm not able to return the correct groupings from my clusters.

In this example, the plot shows 4 main clusters, but in the largest cluster, not all nodes are connected:

enter image description here

I would like to be able to return the following list of clusters from this graph object:

[[1]]
[1] 8 9

[[2]]
[1]  7 10

[[3]]
[1]  4  6 11

[[4]]
[1] 2 3 5

[[5]]
[1]  1  3  5 12

Example code:

library(igraph)

topology <- structure(list(N1 = c(1, 3, 5, 12, 2, 3, 5, 1, 2, 3, 5, 12, 4, 
6, 11, 1, 2, 3, 5, 12, 4, 6, 11, 7, 10, 8, 9, 8, 9, 7, 10, 4, 
6, 11, 1, 3, 5, 12), N2 = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 
3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 
11, 11, 11, 12, 12, 12, 12)), .Names = c("N1", "N2"), row.names = c(NA, 
-38L), class = "data.frame")

g2 <- graph.data.frame(topology, directed=FALSE)
g3 <- simplify(g2)
plot(g3)

The cliques function gets me part of the way there:

tmp <- cliques(g3)
tmp

but, this list also gives groupings where not all nodes connect. For example, this clique includes the nodes 1,2,3,5 but 1 only connects to 3, and 2 only connects to 3 and 5, and 5 only connects to 2 :

topology[tmp[[31]],]
#  N1 N2
#6  3  2
#7  5  2
#8  1  3

Thanks in advance for any help.

tonytonov
  • 25,060
  • 16
  • 82
  • 98
Marc in the box
  • 11,769
  • 4
  • 47
  • 97
  • What would you like returned in the case where there was no node 12 in that cluster, but all other connections in it remained the same? (I won't be able to work on this right now, but think that clarification might help anyone who does give this a shot.) – Josh O'Brien May 15 '14 at 19:20
  • @JoshO'Brien - THanks for your help: I believe I would want a cluster of 1,3,5 and one of 2,3,5. – Marc in the box May 15 '14 at 19:53
  • 1
    So in a simple loop of 5 nodes, it follows that you'd likewise want all five connected pairs to be returned. – Josh O'Brien May 15 '14 at 19:57

1 Answers1

3

You could use maximal.cliques in the igraph package. See below.

# Load package
library(igraph)

# Load data
topology <- structure(list(N1 = c(1, 3, 5, 12, 2, 3, 5, 1, 2, 3, 5, 12, 4, 
6, 11, 1, 2, 3, 5, 12, 4, 6, 11, 7, 10, 8, 9, 8, 9, 7, 10, 4, 
6, 11, 1, 3, 5, 12), N2 = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 
3, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 
11, 11, 11, 12, 12, 12, 12)), .Names = c("N1", "N2"), row.names = c(NA, 
-38L), class = "data.frame")

# Get rid of loops and ensure right naming of vertices
g3 <- simplify(graph.data.frame(topology[order(topology[[1]]),],directed = FALSE))

# Plot graph
plot(g3)

# Calcuate the maximal cliques
maximal.cliques(g3)

# > maximal.cliques(g3)
# [[1]]
# [1] 9 8
# 
# [[2]]
# [1] 10  7
# 
# [[3]]
# [1] 2 3 5
# 
# [[4]]
# [1]  6  4 11
# 
# [[5]]
# [1] 12  1  5  3
majom
  • 7,863
  • 7
  • 55
  • 88
  • 1
    A workaround to the labeling is to order the data frame: `simplify(graph.data.frame(topology[order(topology[[1]]),],directed = FALSE))` – Blue Magister May 15 '14 at 20:12
  • Excellent work! Can you explain why the reordering of the `topology` object gets rid of loops? – Marc in the box May 16 '14 at 03:02
  • @ Marc in the box: The reordering does not get rid of loops. It is just that the function already calculate the clusters you want to have (i.e. the maximum sized cliques of all possible cliques). I am sure there are loops in the C++ implementation of this function in `igraph`. – majom May 16 '14 at 13:34
  • can someone please take a look at my question? https://stackoverflow.com/questions/64997562/r-connecting-graphs thanks – stats_noob Nov 25 '20 at 05:10