I have this dataset:

df <- structure(list(name = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 2L), .Label = c("node1", "node10", "node2", "node3", 
"node4", "node5", "node6", "node7", "node8", "node9"), class = "factor"), 
    value = c(100L, 14L, 2L, 0L, 25L, 0L, 0L, 43L, 7L, 0L)), .Names = c("name", 
"value"), class = "data.frame", row.names = c(NA, -10L))

I would like the nodes whose value equals 0 to be red, and the nodes whose value is greater than or equal to one to be red, with a circle that gets bigger the larger the value is.

Is it possible to do this using igraph?

Dataset with edges. Input dataframe:

EDIT from comment

I made this dataset based on books and citations: books = nodes and citations = links. Every book is unique and has citations. A citation can be common to more than one book, which is why, for example, link1 appears in multiple columns. link44 to link100 are citations that exist only in book1 and not in the other books. Because books and citations have words as titles, which would not be helpful for making a graph, I changed the book titles to numbered nodes and the citations to numbered links. Citations that are common to more than one book have the same id, e.g. link1.

dput(df)
structure(list(node1 = structure(c(1L, 13L, 24L, 35L, 46L, 57L, 
68L, 79L, 90L, 2L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 14L, 
15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 25L, 26L, 27L, 28L, 
29L, 30L, 31L, 32L, 33L, 34L, 36L, 37L, 38L, 39L, 40L, 41L, 42L, 
43L, 44L, 45L, 47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 
58L, 59L, 60L, 61L, 62L, 63L, 64L, 65L, 66L, 67L, 69L, 70L, 71L, 
72L, 73L, 74L, 75L, 76L, 77L, 78L, 80L, 81L, 82L, 83L, 84L, 85L, 
86L, 87L, 88L, 89L, 91L, 92L, 93L, 94L, 95L, 96L, 97L, 98L, 99L, 
100L, 3L), .Label = c("link1", "link10", "link100", "link11", 
"link12", "link13", "link14", "link15", "link16", "link17", "link18", 
"link19", "link2", "link20", "link21", "link22", "link23", "link24", 
"link25", "link26", "link27", "link28", "link29", "link3", "link30", 
"link31", "link32", "link33", "link34", "link35", "link36", "link37", 
"link38", "link39", "link4", "link40", "link41", "link42", "link43", 
"link44", "link45", "link46", "link47", "link48", "link49", "link5", 
"link50", "link51", "link52", "link53", "link54", "link55", "link56", 
"link57", "link58", "link59", "link6", "link60", "link61", "link62", 
"link63", "link64", "link65", "link66", "link67", "link68", "link69", 
"link7", "link70", "link71", "link72", "link73", "link74", "link75", 
"link76", "link77", "link78", "link79", "link8", "link80", "link81", 
"link82", "link83", "link84", "link85", "link86", "link87", "link88", 
"link89", "link9", "link90", "link91", "link92", "link93", "link94", 
"link95", "link96", "link97", "link98", "link99"), class = "factor"), 
    node2 = structure(c(1L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 
    3L, 4L, 5L, 6L, 7L, 8L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA), .Label = c("link1", "link10", "link11", "link12", 
    "link13", "link14", "link15", "link16", "link4", "link5", 
    "link6", "link7", "link8", "link9"), class = "factor"), node3 = structure(c(1L, 
    2L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), .Label = c("link1", 
    "link2"), class = "factor"), node4 = c(NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), node5 = structure(c(1L, 12L, 19L, 20L, 
    21L, 22L, 23L, 24L, 25L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
    10L, 11L, 13L, 14L, 15L, 16L, 17L, 18L, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), .Label = c("link1", 
    "link10", "link11", "link12", "link13", "link14", "link15", 
    "link16", "link17", "link18", "link19", "link2", "link20", 
    "link21", "link22", "link23", "link24", "link25", "link3", 
    "link4", "link5", "link6", "link7", "link8", "link9"), class = "factor"), 
    node6 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), node7 = c(NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), node8 = structure(c(1L, 
    12L, 23L, 34L, 39L, 40L, 41L, 42L, 43L, 2L, 3L, 4L, 5L, 6L, 
    7L, 8L, 9L, 10L, 11L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 
    20L, 21L, 22L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 
    33L, 35L, 36L, 37L, 38L, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA), .Label = c("link1", "link10", "link11", 
    "link12", "link13", "link14", "link15", "link16", "link17", 
    "link18", "link19", "link2", "link20", "link21", "link22", 
    "link23", "link24", "link25", "link26", "link27", "link28", 
    "link29", "link3", "link30", "link31", "link32", "link33", 
    "link34", "link35", "link36", "link37", "link38", "link39", 
    "link4", "link40", "link41", "link42", "link43", "link5", 
    "link6", "link7", "link8", "link9"), class = "factor"), node9 = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA), .Label = c("link1", 
    "link2", "link3", "link4", "link5", "link6", "link7"), class = "factor"), 
    node10 = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("node1", 
"node2", "node3", "node4", "node5", "node6", "node7", "node8", 
"node9", "node10"), class = "data.frame", row.names = c(NA, -100L
))

In the graph, every node is depicted by a circle with a diameter proportional to its frequency. The column names are the nodes, and the links in every row are the connecting elements.

How can I give the link nodes a yellow color and the nodes with 0 frequency a red color?

user20650
Sasak
  • You don't seem to have any edges in your graph? – user20650 Jun 18 '17 at 14:15
  • @user20650 yes I don't have any edges. – Sasak Jun 18 '17 at 14:17
  • Okay, you can colour vertices by setting `V(g)$color` (see [this](https://stackoverflow.com/questions/15999877/correctly-color-vertices-in-r-igraph)) and adjust vertex size by setting `vertex.size` when plotting (see [this](https://stackoverflow.com/questions/12058556/adjusting-the-node-size-in-igraph-using-a-matrix)) – user20650 Jun 18 '17 at 14:20
  • @user20650 thank you. If it is more helpful I added a dataset in order to have a network of edges. – Sasak Jun 18 '17 at 15:15
  • Did you really mean that the nodes should be red _both_ when the value is zero and when it is greater than or equal to one? Or did you intend to ask for different colors? – G5W Jun 18 '17 at 16:00
  • @G5W Not sure if this helps: the colors should differ. If a node has a frequency, it gets a specific color and a bigger circle based on the size of the frequency. Otherwise, if it has nothing (frequency equal to 0), it gets gray; the nodes in this category in the dataset are node4, node6, node7 and node10. – Sasak Jun 18 '17 at 19:05
  • @Sasak ; could you explain your updated dataset df please as its format is unconventional - how are each of the ten nodes connected by these links (which number up to a hundred) – user20650 Jun 18 '17 at 20:52
  • There are various ways to represent it, but from your input it is unclear (at least to me) how the nodes are connected. You have various link# under each node, but I don't see how these are meant to connect to other nodes. – user20650 Jun 19 '17 at 09:39
  • @user20650 Every column in which, for example, link1 exists shares that connection. For example, link1 exists in node1, node2, node3, node5 and node8, so they have one connection in common. The link number is unique within each column, and when you find it in another column, that is a connection between the nodes. Is this helpful? Thank you for your time. – Sasak Jun 19 '17 at 09:44
  • I'm stumped, sorry. Under node1 there are multiple links, link44 to link100, that are not under any other nodes. What edges do these represent: are they loops (i.e. 56 edges from node1 to node1)? (PS: where did you get the data from, as it is a bit of a strange format to use to generate a graph?) – user20650 Jun 19 '17 at 09:49
  • @user20650 I made this dataset based on books and citations: books = nodes and citations = links. Every book is unique and has citations. A citation can be common to more than one book, which is why, for example, link1 appears in multiple columns. link44 to link100 are citations that exist only in book1 and not in the other books. Because books and citations have words as titles, which would not be helpful for making a graph, I changed the book titles to numbered nodes and the citations to numbered links. Citations that are common to more than one book have the same id, e.g. link1. – Sasak Jun 19 '17 at 10:10
  • @user20650 So I am trying to find which books are connected based on the citations they have in common. Does this help in understanding the structure of the dataset? – Sasak Jun 19 '17 at 10:11
  • Seems the first task is to arrange your data: This creates a weighted adjacency matrix for the common links between books which can then be read in to igraph: `x = as.matrix(df) ; x <- !is.na(x) ; adjMat <- crossprod(x)` – user20650 Jun 19 '17 at 15:14
  • @user20650 Thank you for all your guidance in making my question sufficient. I thought it was clear what I tried, but it wasn't. If you want, you could post this as an answer rather than a comment so that I can accept it. – Sasak Jun 19 '17 at 20:51
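
The suggestion in the comments (set `V(g)$color` and a vertex size) can be sketched for the first, edgeless dataset roughly as follows. This is a minimal sketch, not the answerer's code: the data frame name `nodes`, the gray/red color choice (gray for zero, following the asker's comment) and the square-root size scaling are assumptions made for illustration.

```r
library(igraph)

# The name/value pairs from the first dataset in the question
nodes <- data.frame(
  name  = paste0("node", 1:10),
  value = c(100, 14, 2, 0, 25, 0, 0, 43, 7, 0),
  stringsAsFactors = FALSE
)

# A graph with ten vertices and no edges
g <- make_empty_graph(n = nrow(nodes), directed = FALSE)
V(g)$name <- nodes$name

# Assumed colour scheme: gray when the value is 0, red otherwise
V(g)$color <- ifelse(nodes$value == 0, "gray", "red")

# Assumed scaling: sizes between 5 and 30, growing with the square
# root of the value so that node1 does not dwarf everything else
V(g)$size <- 5 + 25 * sqrt(nodes$value / max(nodes$value))

plot(g, vertex.label.cex = 0.8)
```

Setting the attributes on the graph means `plot()` picks them up without extra arguments; the same values could instead be passed as `vertex.color` and `vertex.size` at plot time.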

1 Answer

# The adjacency matrix between the nodes comes from two steps:
# 1. convert each cell to TRUE/FALSE (link present / link missing);
# 2. take the cross-product, which counts, for each pair of nodes,
#    the rows in which both nodes have a link.
# This relies on a shared link sitting in the same row under each node.
x <- as.matrix(df)
x <- !is.na(x)
adjMat <- crossprod(x)

library(igraph)

# Read in the adjacency matrix
g = graph_from_adjacency_matrix(adjMat, mode="undirected", weighted=TRUE, diag=FALSE)

# For your original question:
# set colour via the V(g)$color attribute;
# if the diagonal is nonzero the vertex colour is red, else blue
V(g)$color <- ifelse(diag(adjMat) > 0, "red", "blue")
# Similarly for size: here vertex size is set to the number of
# citations each book has (the diagonal of the adjacency matrix)
V(g)$size <- diag(adjMat)

# Plot setting the edge weight equal to the number of shared links
plot(g, edge.width=E(g)$weight/2)

# If you want to remove the nodes with zero citations,
# igraph's delete_vertices() can do it, or you can subtract them by name
g1 <- g - paste0("node", which(diag(adjMat)==0))
plot(g1, edge.width=E(g1)$weight/2)
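
The cross-product above counts rows in which two nodes both have an entry, so it assumes a shared link always sits in the same row. In the posted `df` that may not hold: row 2, for instance, holds link2 under node1 but link4 under node2. A minimal sketch of a variant that matches links by id instead, on a hypothetical toy table (the names `toy`, `bookA`, `bookB`, `bookC`, `by_row` and `by_id` are made up for illustration and are not from the original post):

```r
# Hypothetical toy table: columns are books, cells are citation ids
toy <- data.frame(
  bookA = c("link1", "link2", "link3"),
  bookB = c("link1", "link4", NA),
  bookC = NA_character_,
  stringsAsFactors = FALSE
)

# Row-position approach: two books "share" a row whenever both cells
# are filled, whatever the ids in those cells are
by_row <- crossprod(!is.na(as.matrix(toy)))

# Id-based approach: build a link-by-book incidence matrix first
ids <- unique(unlist(toy))
ids <- sort(ids[!is.na(ids)])
inc <- vapply(toy, function(col) ids %in% col, logical(length(ids)))
rownames(inc) <- ids

# Cross-product of the incidence matrix: the diagonal holds the number
# of citations per book, the off-diagonal the number of shared ids
by_id <- crossprod(inc)

by_row["bookA", "bookB"]  # counts rows 1 and 2
by_id["bookA", "bookB"]   # counts only link1
```

With factor columns, as in the posted `df`, the columns would need converting first, for example with `df[] <- lapply(df, as.character)`. Either matrix can then be passed to `graph_from_adjacency_matrix()` exactly as above.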
user20650