
I am using the cluster_infomap function from igraph in R to detect communities in an undirected, unweighted network with ~19,000 edges, but I get a different number of communities each time I run the function. This is the code I am using:

   clusters <- list()
   clusters[["im"]] <- cluster_infomap(graph)
   membership_local_method <- membership(clusters[["im"]])
   length(unique(membership_local_method))

The result of the last line of code ranges from 805 to 837 in the tests I have performed. I tried using set.seed() in case it was an issue of random number generation, but this did not solve the problem.

My questions are (1) why do I get different communities each time, and (2) is there a way to make it stable?

Thanks!

asmac
  • There's no way to tell from the info provided - can you make a reproducible example that gives different results on each run? – thelatemail Dec 21 '16 at 04:17
  • Please hover over the R tag - it asks for a minimal reproducible example. [Here's a guide](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610); also look at the R help files (e.g. `?cluster_infomap`, _examples_ section) and answers of regular posters (click the R tag). After that, edit & improve your question accordingly. A good one usually provides minimal input data, the desired output data, code tries incl required packages - all copy-paste-run'able in a new/clean R session. *Why?* It makes it easier for all to follow and participate. – lukeA Dec 21 '16 at 09:40

1 Answer

cluster_infomap (see ?igraph::cluster_infomap for help) finds a "community structure that minimizes the expected description length of a random walker trajectory".

The algorithm is stochastic: it relies on random number generation, so results can differ from run to run. Most of the time, you can make a run reproducible by setting a seed with set.seed (see ?Random for help) immediately before each call:

identical(cluster_infomap(g), cluster_infomap(g))
# [1] FALSE
identical({set.seed(1);cluster_infomap(g)},{set.seed(1);cluster_infomap(g)})
# [1] TRUE

or graphically:

library(igraph)
set.seed(2)
g <- sample_pa(150)          # preferential attachment graph (formerly ba.game)
coords <- layout_nicely(g)   # automatic layout (formerly layout.auto)
par(mfrow=c(2,2))

# without seed: different results
for (x in 1:2) {
  plot(
    cluster_infomap(g), 
    as.undirected(g), 
    layout=coords, 
    vertex.label = NA, 
    vertex.size = 5
  )
}

# with seed: equal results
for (x in 1:2) {
  set.seed(1)
  plot(
    cluster_infomap(g), 
    as.undirected(g), 
    layout=coords, 
    vertex.label = NA, 
    vertex.size = 5
  )
}
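
To see how much the partitions actually differ between runs, and to pick one in a principled way, something like the following may help. It is only a sketch under a few assumptions: the seeds 1 to 5 and the toy graph are arbitrary, and choosing the run with the shortest code length is one reasonable selection rule, not the only one. It uses the nb.trials argument of cluster_infomap together with code_len() and compare() from igraph.

```r
library(igraph)

# Toy graph for illustration only (not the asker's network)
set.seed(2)
g <- sample_pa(150, directed = FALSE)

# Run Infomap from several arbitrary seeds; nb.trials sets how many
# attempts the algorithm makes internally within each call
runs <- lapply(1:5, function(s) {
  set.seed(s)
  cluster_infomap(g, nb.trials = 10)
})

# Number of communities and code length (the quantity Infomap minimizes) per run
n_comm    <- sapply(runs, function(cl) length(unique(membership(cl))))
code_lens <- sapply(runs, code_len)

# Normalized mutual information of each run against the first run
# (a value of 1 means the two partitions are the same)
nmi <- sapply(runs, function(cl) compare(runs[[1]], cl, method = "nmi"))

print(data.frame(seed = 1:5, n_comm = n_comm, code_len = code_lens, nmi = nmi))

# Keep the partition with the shortest description length
best <- runs[[which.min(code_lens)]]
```

If the NMI values stay close to 1, the runs mostly agree and the varying community counts come from a few vertices moving between groups. Raising nb.trials tends to reduce the spread between runs, at the cost of longer run time.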
lukeA
  • Thanks for your detailed answer! I see that I need to set.seed right before calling the cluster_infomap function. That solves the problem. Thanks! – asmac Dec 21 '16 at 21:34
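
The point about seed placement can be illustrated with a short sketch. The graph is a toy example and the helper run_infomap is a hypothetical wrapper, not part of igraph. Setting the seed once at the top of a script fixes only the first call; later calls continue the same random stream, so they are not guaranteed to match it.

```r
library(igraph)

# Toy graph for illustration only
set.seed(2)
g <- sample_pa(150, directed = FALSE)

# Seed set once: the second call starts from wherever the first call
# left the random number stream, so a1 and a2 may differ
set.seed(1)
a1 <- as.vector(membership(cluster_infomap(g)))
a2 <- as.vector(membership(cluster_infomap(g)))

# Seed reset immediately before each call: both calls start from the same state
set.seed(1)
b1 <- as.vector(membership(cluster_infomap(g)))
set.seed(1)
b2 <- as.vector(membership(cluster_infomap(g)))
identical(b1, b2)

# Hypothetical convenience wrapper that always reseeds before clustering
run_infomap <- function(graph, seed = 1, ...) {
  set.seed(seed)
  cluster_infomap(graph, ...)
}
c1 <- as.vector(membership(run_infomap(g)))
```

Wrapping the call this way keeps the seed and the clustering step together, which makes it harder to introduce other random operations between them by accident.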