Visualizing the result of dividing the network into communities

Question

The dataset includes a network matrix and an attribute data frame. The network file itself contains three datasets, of which I only want to work with PrinFull, together with just the PRIN attribute data. My data is uploaded at the two links below. I added all the attributes to my dataset.

https://drive.google.com/file/d/1MZCdeAZF0joIQLwVeoVXmKpf7r8IJ2wq/view?usp=sharing
https://drive.google.com/file/d/1I96BAUo8TjJMWCWpn_SIhp54snfZ0Bd5/view?usp=sharing

I want to plot the result of my community detection algorithm. The code is below, but my plot is messy and hard to understand. How can I plot it in a better way? Can anyone help me?

load('/content/CISPRINWOSmatrices.RData')
load('/content/CISPRINWOS_attributes.RData')

library("igraphdata")
library("igraph")
library("network")
library("statnet")
library("intergraph")
library("dplyr")
library("stringr")
library("RColorBrewer")
library("sand")




nodePRIN <- data.frame(PRIN)
#nodePRIN
relationsp <- as.matrix(PrinFull)

PRIN_graph = graph_from_adjacency_matrix(relationsp, mode="undirected",weighted = TRUE)
PRIN_graph

# Girvan-newman algorithm
gn.comm <- cluster_edge_betweenness(PRIN_graph)

#How many communities?

unique(gn.comm$membership)

#attach community labels as vertex attribute
V(PRIN_graph)$GN.cluster <- membership(gn.comm)
PRIN_graph

V(PRIN_graph)$Author[V(PRIN_graph)$GN.cluster==69]
# visualizing the result of dividing the network into communities

par(mar=c(0,0,0,0))

colors <- rainbow(max(membership(gn.comm)))
plot(gn.comm, PRIN_graph, vertex.size = 6, 
vertex.color=colors[membership(gn.comm)], vertex.label = NA, edge.width = 1)


score 3 · Accepted Answer · answered Jun 26 '20 at 15:24

Nothing that you can do will make it easy to see 2839 nodes with 9379 links. There just isn't that much space on the screen. Nevertheless, I have some suggestions that may provide more insight than just passing the graph into plot.

First, a quick glance at your plot reveals that this graph is not composed of a single connected component.

COMP = components(PRIN_graph)
table(COMP$membership)
   1    2    3    4    5    6    7    8    9   10   11   12   13   14 
2696   42    2    4   18   13    2    7    7    2    3    2    2    2 
  15   16   17   18   19   20   21   22   23   24   25   26   27 
   2    6   14    3    1    1    1    2    1    3    1    1    1 

So 2696 of the nodes are in a single large component and the remaining 143 are in 26 small components. The 2696 nodes in the big component overwhelm the smaller components, and the 26 small components act as visual clutter for the big component. Let's separate out the 26 small components.

SC = which(COMP$membership != 1)
SmallComps = induced_subgraph(PRIN_graph, SC)
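The split above works because the largest component happens to carry label 1 in this data. As a minimal sketch on a made-up toy graph (not the question's data), the large component can instead be picked by size, which does not depend on how the components are numbered:

```r
library(igraph)

# Toy graph (an assumption for illustration): a 5-node ring plus two 2-node pieces
toy <- disjoint_union(make_ring(5), make_full_graph(2), make_full_graph(2))

comp <- components(toy)
biggest <- which.max(comp$csize)   # label of the largest component

# Split the graph into the largest component and everything else
large_part <- induced_subgraph(toy, which(comp$membership == biggest))
small_part <- induced_subgraph(toy, which(comp$membership != biggest))

vcount(large_part)  # 5
vcount(small_part)  # 4
```

Here `toy`, `large_part` and `small_part` are hypothetical names; with the real data, `PRIN_graph` would take the place of `toy`.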

Now it is easy to see the community structure on all of these small components.

SC.gn.comm <- cluster_edge_betweenness(SmallComps)
colors <- rainbow(max(membership(SC.gn.comm)))
plot(SC.gn.comm, SmallComps, vertex.size = 6, 
    vertex.color=colors[membership(SC.gn.comm)], 
    vertex.label = NA, edge.width = 1)

Mostly, the small components consist of a single community, although a few have some structure.

That was the easy part. Now let's look at the big component.

LC = which(COMP$membership == 1)
LargeComp = induced_subgraph(PRIN_graph, LC)

Girvan-Newman produces 43 communities within this large component:

LC.gn.comm <- cluster_edge_betweenness(LargeComp)
max(LC.gn.comm$membership)
[1] 43

But simply plotting that still leaves a mess.

par(mar=c(0,0,0,0))
colors <- rainbow(max(membership(LC.gn.comm)))
set.seed(1234)
plot(LC.gn.comm, LargeComp, vertex.size = 6, 
    vertex.color=colors[membership(LC.gn.comm)], 
    vertex.label = NA, edge.width = 1)

I will suggest two ways to improve the appearance of this graph:
separating the communities and contracting the communities.

Separating Communities

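Based on this previous answer, we can place the vertices of each community close together and push different communities further apart. The trick is to add extra, heavily weighted edges between all members of a community and compute the layout on that augmented graph.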

LC_Grouped = LargeComp
E(LC_Grouped)$weight = 1
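for(i in unique(membership(LC.gn.comm))) {
    GroupV = which(membership(LC.gn.comm) == i)
    # combn() on a single number n expands to 1:n, so skip one-member communities
    if(length(GroupV) > 1) {
        LC_Grouped = add_edges(LC_Grouped, combn(GroupV, 2), attr=list(weight=6))
    }
} 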

set.seed(1234)
LO = layout_with_fr(LC_Grouped)
colors <- rainbow(max(membership(LC.gn.comm)))
par(mar=c(0,0,0,0))
plot(LC.gn.comm, LargeComp, layout=LO,
    vertex.size = 6, 
    vertex.color=colors[membership(LC.gn.comm)], 
    vertex.label = NA, edge.width = 1)
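A related variant, which is a different technique from the one above and is shown only as a sketch on a made-up toy graph, is to leave the edge set unchanged and instead give within-community edges a larger layout weight than between-community edges. It uses igraph's `crossing()` to tell the two kinds of edges apart; the weights 1 and 10 are arbitrary assumptions.

```r
library(igraph)

# Toy graph (an assumption): three dense groups of 6 nodes with sparse links between them
set.seed(1234)
g <- sample_islands(islands.n = 3, islands.size = 6, islands.pin = 0.9, n.inter = 1)
comm <- cluster_edge_betweenness(g)

# crossing() is TRUE for edges whose endpoints lie in different communities
between <- crossing(comm, g)

# In Fruchterman-Reingold, larger weights mean stronger attraction along an edge,
# so within-community edges get the larger weight
w <- ifelse(between, 1, 10)

set.seed(1234)
lay <- layout_with_fr(g, weights = w)
plot(comm, g, layout = lay, vertex.label = NA, vertex.size = 8)
```

With either approach the modified weights are used only to compute positions; the plot still draws the original edges of the graph.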

This makes the communities stand out better, but it is still pretty hard to see the relationships between them. Another option is to contract the communities.

Contract the Communities

Just plot a single node for each community. Here, I make the area of each community vertex proportional to the number of members of that community, and I color the vertices using a coarse grouping based on their degrees.

GN.Comm = simplify(contract(LargeComp, membership(LC.gn.comm)))
D = unname(degree(GN.Comm))

set.seed(1234)
par(mar=c(0,0,0,0))
plot(GN.Comm, vertex.size=sqrt(sizes(LC.gn.comm)),
    vertex.label=1:43, vertex.label.cex = 0.8,
    vertex.color=round(log(D))+1)

You can see that some communities barely connect to any others and some are very well connected. None of these visualizations are perfect, but I hope that they might provide some insight into the structure and relationships.
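If the number of links between two communities matters as well, the contraction can keep a count of them as an edge weight. The following is a minimal sketch on a made-up toy graph with hand-assigned community labels, not the question's data; the names `g`, `memb` and `cg` are hypothetical.

```r
library(igraph)

# Toy graph (an assumption): two 5-cliques joined by two bridging edges
g <- disjoint_union(make_full_graph(5), make_full_graph(5))
g <- add_edges(g, c(1, 6, 2, 7))
memb <- rep(1:2, each = 5)   # hand-assigned community labels for illustration

E(g)$weight <- 1             # each original edge counts once

# Merge all vertices of a community into one vertex
cg <- contract(g, memb, vertex.attr.comb = "ignore")

# Drop the loops (within-community edges) and sum the weights of parallel edges
cg <- simplify(cg, remove.loops = TRUE, edge.attr.comb = list(weight = "sum"))

E(cg)$weight   # 2: number of links between the two communities

plot(cg, edge.width = E(cg)$weight)
```

Mapping `E(cg)$weight` to `edge.width` then shows at a glance which communities are strongly tied and which are connected by only a few links.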

Thanks a lot. When I run the code for the small components, it gives me this error. Do you know why? — eli, Jun 27 '20 at 10:15
Error in if ((n <- as.integer(n[1L])) > 0) {: missing value where TRUE/FALSE needed Traceback: 1. rainbow(max(membership(SC.gn.comm))) — eli, Jun 27 '20 at 10:15
I really don't know why you did not get an error, but with the same code I get one. — eli, Jun 27 '20 at 10:28
I do not get any error. Which part throws the error - i.e. try running just `max(membership(SC.gn.comm))` to see if that causes the error or if it is only once you run `rainbow`. — G5W, Jun 27 '20 at 12:10
I tried your code with a small change and it helps me a lot. Thank you so much. — eli, Jun 27 '20 at 13:11
@G5W: I am working on a similar problem (https://stackoverflow.com/questions/64690623/r-how-to-efficiently-visualize-a-large-graph-network). I am interested in visualizing a large network graph with R (and eventually doing community detection). When I just plot a graph of several thousand points (even before community detection), it is so cluttered and crowded that you cannot make sense of it. Could I use the same logic and just make a subgraph of the nodes having some minimum number of connections (e.g. 5)? Thank you! — stats_noob, Nov 07 '20 at 06:34
@G5W: Is there an optimal setting in which graph clustering should be used? E.g. https://stackoverflow.com/questions/64849921/r-k-means-clustering-vs-community-detection-algorithms-weighted-correlation-ne in this example, does it make sense to use graph clustering methods on a weighted correlation network? — stats_noob, Nov 15 '20 at 21:39
