0

I would like to know if I can find so-called n-cliques in an igraph object. Those are defined as "a maximal subgraph in which the largest geodesic distance between any two nodes is no greater than n" according to Wasserman & Faust. I'm aware that cliques of n=1 can be found via cliques() and that the sizes of cliques can be defined beforehand, but is there any way to find cliques of n larger than 1?

supersambo
  • 811
  • 1
  • 9
  • 25
  • In the future, please supplement your R questions with a minimal reproducible example (hover over R tag). [Here's how](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#answer-5963610). – lukeA Oct 17 '16 at 17:02
  • Thank you very much for the advice! I wasn't sure about the usefulness of an example in this case. Since I'm not reporting on a potential bug or any unexpected behavior but only asking for a specific feature, I don't know how a meaningful example would look like. – supersambo Oct 18 '16 at 07:52
  • You're welcome. It could be as simple as `library(igraph);set.seed(1); g <- random.graph.game(100, p.or.m = 300, type = "gnm")`. Based on that graph, you could illustrate what you expect `magicCliqueFun` to return. That way, all are talking about the same basis. In addition, it increases the chances of getting responses, because R Stackoverflow folks like to spot code instead of digging through heaps of texts. (In fact, posts without reproducible code are not very welcome afaik.) – lukeA Oct 18 '16 at 08:05
  • 1
    Ok got it. This also explains the low response rate of some of my other questions :) – supersambo Oct 18 '16 at 08:18

3 Answers3

0

In theory, you could try RBGL::kCliques:

library(igraph)
library(RBGL)
set.seed(1)
g <- random.graph.game(100, p.or.m = 300, type = "gnm")
coords <- layout.auto(g)
cl <- kCliques(igraph.to.graphNEL(g))

k <- 2
clSel <- cl[[paste0(k, '-cliques')]][[1]] # select first of all k-cliques (e.g.)

plot(
  g, 
  layout = coords,
  vertex.shape = "none",
  vertex.label.color = ifelse(V(g) %in% clSel, "red", "darkgrey"), 
  edge.color = ifelse(tail_of(g, E(g)) %in% clSel & head_of(g, E(g)) %in% clSel, "orange", "#F0F0F099"),
  vertex.size = .5, 
  edge.curved = 1
)

However, in practice...

all(print(distances(induced_subgraph(g, clSel))) <=k ) # should be TRUE
# [1] FALSE

there seems to be something wrong if we use the definition:

In Social Network Analysis, a k-clique in a graph is a subgraph where the distance between any two nodes is no greater than k.

Or maybe I misunderstood something...

lukeA
  • 53,097
  • 5
  • 97
  • 100
0

Thanks to lukeA for pointing out RBGL::kCliques as a solution within R for this problem.

n-cliques are allowed to have links through other nodes that aren't cliques. So A -- B -- C -- D, with B -- E and C -- E as well can be a 2-clique if A and D are linked through another node, F, even though F is not in the 2-clique (since it is 3 away from E). See http://faculty.ucr.edu/~hanneman/nettext/C11_Cliques.html#nclique

n-clans are not allowed to have this behavior, however; all paths must pass through members of the subgraph to count. lukeA's test therefore demonstrates that the n-cliques are not all n-clans.

You could construct a function that outputs n-clans by throwing out all subgraphs in which the paths aren't fully within the subgraph, e.g.,

nclan <- function(g,n){
  g <- as.undirected(g)
  E(g)$weight <- 1 #just in case g has weights - does not modify original graph
  ncliques <- kCliques(ugraph(igraph.to.graphNEL(g))) #get cliques
  n.cand <- ncliques[[n]] #n-clique candidates to be an n-clan
  n.clan <- list() #initializes a list to store the n-clans
  n.clan.i <- 1 #initializes a list pointer
  for (n.cand.i in 1:length(n.cand)){ #loop over all of the candidates
    g.n.cand <- induced_subgraph(g,n.cand[[n.cand.i]]) #get the subgraph
    if (diameter(g.n.cand)<=n){ #check diameter of the subgraph
      n.clan[[n.clan.i]] <- n.cand[[n.cand.i]] #add n-clan to the list
      n.clan.i <- n.clan.i+1 #increment list pointer
    }
  }
  return(n.clan) #return the entire list
}

The removal of edge weights is due to an odd bug in RBGL's kCliques implementation. Similarly, you can write a k-plex function:

kplex <- function(g,k,m){
  g.sym <- as.undirected(g) #to make sure that degree functions properly
  g.sym.degmk <- induced_subgraph(g.sym,igraph::degree(g.sym)>=(m-k)) #makes algorithm faster
  k.cand <- combn(V(g.sym.degmk)$name,m) #all candidate combinations with m members
  k.plex <- list() #initializes a list to store the k-plexes
  k.plex.i <- 1 #initializes a list pointer
  for (k.cand.i in 1:dim(k.cand)[2]){ #loop over all of the columns
    g.k.cand <- induced_subgraph(g.sym.degmk,k.cand[,k.cand.i]) #get the subgraph
    if (min(igraph::degree(g.k.cand))>=(m-k)){ #if minimum degree of sugraph is > m=k, k-plex!
      k.plex[[k.plex.i]] <- k.cand[,k.cand.i] #add k-plex to list
      k.plex.i <- k.plex.i+1 #increment list pointer
    }
  }
  return(k.plex) #return the entire list
}
0

You can use the connect.neighborhood() igraph function to connect each vertex all all others no more than distance k away. Then you can find cliques in the resulting graph. This will give you the "k-cliques", as you defined them.


Those are defined as "a maximal subgraph ...

I'm aware that cliques of n=1 can be found via cliques()

Careful here. cliques() finds all cliques, both maximal and non-maximal. max_cliques() finds only maximal cliques. Choose the one that is appropriate for your application.

Szabolcs
  • 24,728
  • 9
  • 85
  • 174