I set up a correlation matrix in Excel, read it into R, and make some adjustments so that I can plot the result of a community detection at the end. This works just fine.
To adjust the plot interactively, I use the function tkplot(), which gives the error message:
Error in col[idx] <- substr(col[idx], 1, 7) : NAs are not allowed in subscripted assignment
I have a 49x49 matrix. I found out that the problem is caused by a few numbers in this matrix: if I set them to 1, tkplot() works. After restarting R, however, the problematic numbers are no longer the same ones, so I have to find them manually by changing some values to 1 in the Excel file and then checking whether the code works in R.
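Instead of searching by hand in Excel, the cells could perhaps be located directly in R. The following is only a minimal sketch: it assumes the problematic cells show up as NA after the import, and it uses a made-up 3x3 matrix `m` (hypothetical values, not my data) in place of the 49x49 one.

```r
# Toy 3x3 matrix standing in for the 49x49 correlation matrix (made-up values)
m <- matrix(c(1,   0.2, NA,
              0.2, 1,   0.5,
              NA,  0.5, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Row and column indices of every cell that is NA
bad_cells <- which(is.na(m), arr.ind = TRUE)
print(bad_cells)
```

On the real data the same call would be applied to `as.matrix(correlationmatrix_Sustainability)`.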
I am a novice in R, so I apologize if someone has already answered this question in another thread.
My code is the following:
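# Packages (assumed from the functions used below)
library(readxl)    # read_excel()
library(psych)     # cor2dist()
library(reshape2)  # melt()
library(igraph)    # graph_from_data_frame(), cluster_louvain(), tkplot()
# Read Excel
correlationmatrix_Sustainability <- read_excel("Official Research.xlsx",
                                               sheet = "Corr_Sustainability")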
# Delete first column and rename rows
correlationmatrix_Sustainability <- correlationmatrix_Sustainability[,-1]
column_names_Sustainability <- colnames(correlationmatrix_Sustainability)
rownames(correlationmatrix_Sustainability) <- column_names_Sustainability
# Work the magic
distancematrix_Sustainability <- cor2dist(correlationmatrix_Sustainability)
# If instead, you do not want variables with negative correlation to be connected,
# just get rid of the absolute value above. This should be much less connected
DM2_Sustainability <- as.matrix(distancematrix_Sustainability)
## Zero out connections where the correlation is negative
DM2_Sustainability[correlationmatrix_Sustainability < 0] <- 0
# Correlation matrix as long list
matrix(DM2_Sustainability, dimnames=list(t(outer(colnames(DM2_Sustainability), rownames(DM2_Sustainability), FUN=paste)), NULL))
# Number of companies per node
nodes_Sustainability <- read_excel("Official Research.xlsx",
                                   sheet = "Nodes_Sustainability", col_names = TRUE)
# Edges: Correlation matrix as long list
correlationmatrix_Sustainability[correlationmatrix_Sustainability < 0] = 0
correlation_Sustainability <- as.data.frame(correlationmatrix_Sustainability)
rownames(correlation_Sustainability) <- column_names_Sustainability
correlation_Sustainability$rownames <- rownames(correlation_Sustainability)
edges_Sustainability <- melt(correlation_Sustainability,id.vars = "rownames")
colnames(edges_Sustainability) <- c("1","2","weights")
edges_Sustainability <- edges_Sustainability[edges_Sustainability$weights>0, ]
edges_Sustainability <- edges_Sustainability[complete.cases(edges_Sustainability$weights),]
graph_Sustainability <- graph_from_data_frame(d=edges_Sustainability,vertices = nodes_Sustainability,directed = FALSE)
# Louvain Method for community detection
clusterlouvain_Sustainability <- cluster_louvain(graph_Sustainability,weights = edges_Sustainability$weights)
# Change width of edges based on correlation weight
scaling_factor <- 25 # Define a scaling factor (adjust this according to your preference)
E(graph_Sustainability)$width <- E(graph_Sustainability)$weights*scaling_factor
# Change size of nodes based on company count
V(graph_Sustainability)$size <- V(graph_Sustainability)$Count
# Plot graph
plot(graph_Sustainability, layout=layout_nicely,
     vertex.color=rainbow(5, alpha=0.6)[clusterlouvain_Sustainability$membership],
     edge.width=E(graph_Sustainability)$width,
     red=100) # Use res to increase resolution
# Get rid of unnecessary edges
cut.off <- 0.4 # Get rid of all edges with correlation < cut.off
graph_Sustainability.reduced <- delete_edges(graph_Sustainability, E(graph_Sustainability)[weights<cut.off])
plot(graph_Sustainability.reduced, layout=layout_nicely,
     vertex.color=rainbow(5, alpha=0.6)[clusterlouvain_Sustainability$membership],
     edge.width=E(graph_Sustainability.reduced)$width)
# Change placement of nodes myself
# Use Fruchterman-Reingold or Kamada-Kawai
# Not reduced
tkplot(graph_Sustainability, layout=layout_nicely,
       vertex.color=rainbow(5, alpha=0.6)[clusterlouvain_Sustainability$membership],
       edge.width=E(graph_Sustainability)$width,
       red=100)
# Reduced
tkplot(graph_Sustainability.reduced, layout=layout_nicely,
       vertex.color=rainbow(5, alpha=0.6)[clusterlouvain_Sustainability$membership],
       edge.width=E(graph_Sustainability.reduced)$width,
       red=100)
I also tried changing all numbers in the correlation matrix to 1 and using different functions to read the Excel file. Neither worked.
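Since the error message refers to `col`, another thing that could be checked is whether the colour vector passed as vertex.color contains NA. This is only a minimal sketch under my own assumptions: the `membership` vector below is made up (it is not my actual Louvain output) and simply shows that indexing a 5-colour palette with a community number above 5 yields NA.

```r
# Same 5-colour palette as in the plot() and tkplot() calls
palette5 <- rainbow(5, alpha = 0.6)

# Hypothetical community ids; two of them are larger than the palette size
membership <- c(1, 2, 5, 6, 7)

# Indexing past the end of the palette returns NA instead of a colour
vertex_colours <- palette5[membership]
print(is.na(vertex_colours))

# Number of vertices that would end up without a colour
n_missing <- sum(is.na(vertex_colours))
```

On the real objects, the equivalent check would be `anyNA(rainbow(5, alpha=0.6)[clusterlouvain_Sustainability$membership])`, together with `max(clusterlouvain_Sustainability$membership)` to see how many communities were found.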