
I am trying to create an igraph object by splitting the entries of a string vector on a separator (" & "). I use a for-loop to build an edge vector and then convert it into a network graph. The code works but is extremely inefficient on very long vectors (large networks).

Is there a way to improve the process with pipes and mapping? Thanks in advance.

library(igraph)  # graph() used below is provided by igraph
data <- data.frame(nodes=c("A","A & B","C","B & C","B & D"))

V <- c()
for (i in 1:nrow(data)){
  V_temp <- data[i,]
  # Entries containing " & " yield every pair of nodes; a lone node becomes a self-loop
  if (grepl(" & ", data$nodes[i])) {
    N <- t(combn(unlist(strsplit(data$nodes[i], " & ")), 2))
  } else {
    N <- matrix(rep(data$nodes[i], 2), nrow = 1, ncol = 2)
  }
  colnames(N) <- c("N1","N2")
  V_temp <- cbind(N, V_temp, row.names = NULL)
  V <- as.data.frame(rbind(V, V_temp, row.names = NULL))
}

vector <- rbind(as.vector(as.character(V$N1)),
                as.vector(as.character(V$N2)))
plot(graph(vector, directed = FALSE))
MCS

2 Answers

If you are willing to use dplyr and tidyr (separate() comes from tidyr):

library(igraph)
library(dplyr)
library(tidyr)
d <- data %>%
  separate(nodes, c("from", "to")) %>%
  mutate(to = coalesce(to, from))

  from to
1    A  A
2    A  B
3    C  C
4    B  C
5    B  D

Warning message:
Expected 2 pieces. Missing pieces filled with `NA` in 2 rows [1, 3]. 

g <- graph_from_data_frame(d)

separate() issues a warning because some rows contain nothing to split, so their "to" value is NA. The second step, coalesce(), fills each NA in column "to" with the value from column "from", which turns a single node into a self-loop.

You could also specify the separator explicitly if you want: separate(nodes, c("from", "to"), sep = " & ").
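As a small sketch of the explicit-separator variant, the pipeline might look like the block below. The fill = "right" argument is an addition that is not part of the answer above; it is a tidyr option that pads missing pieces with NA on the right without emitting the warning. The data frame is the one from the question.

```r
library(dplyr)
library(tidyr)

# Example data from the question
data <- data.frame(nodes = c("A", "A & B", "C", "B & C", "B & D"),
                   stringsAsFactors = FALSE)

d <- data %>%
  # Split only on the literal " & "; rows with a single node get NA in "to"
  separate(nodes, c("from", "to"), sep = " & ", fill = "right") %>%
  # Replace each NA in "to" with the node itself, giving a self-loop
  mutate(to = coalesce(to, from))
```

The resulting d can be passed to graph_from_data_frame() exactly as before.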

Split data frame string column into multiple columns

How to split column into two in R using separate

desval

Starting from the two suggestions from desval, I came up with the code below. It works when each entry contains one or two nodes, but fails with more, because separate() into two columns keeps only the first two pieces. An example that fails is data <- data.frame(nodes=c("E","A & B","C","B & C","B & D & E")).

See the updated code below.

library(igraph)
library(dplyr)
library(tidyr)  # provides separate()

selfloop <- function(x){
  y <- ifelse(!grepl(" & ",x), paste(x,x, sep = " & "), x)
  return(y)
}

data <- data.frame(nodes=c("E","A & B","C","B & C","B & D"))


g <- data %>%
  mutate(nodes = selfloop(nodes)) %>%
  separate(nodes, c("from", "to"), sep = " & ") %>%
  graph_from_data_frame(directed = FALSE)

plot(g)
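For entries with more than two nodes, one possible approach is to go back to the strsplit() and combn() idea from the question, but to build all edges in a list and bind them once instead of growing a data frame inside a loop. The following is only a sketch under that assumption; edges_from_label is a hypothetical helper name, not an igraph or tidyr function, and the data frame is the failing example mentioned above.

```r
library(igraph)

# The example that breaks the two-column separate() approach
data <- data.frame(nodes = c("E", "A & B", "C", "B & C", "B & D & E"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: turn one entry such as "B & D & E" into a
# two-column character matrix with one row per edge
edges_from_label <- function(label) {
  parts <- strsplit(label, " & ", fixed = TRUE)[[1]]
  if (length(parts) == 1) {
    # A single node becomes a self-loop, as in the original loop
    matrix(c(parts, parts), ncol = 2)
  } else {
    # Every unordered pair among the nodes listed in this entry
    t(combn(parts, 2))
  }
}

# Build every edge matrix first, then bind them in a single step
edge_list <- do.call(rbind, lapply(as.character(data$nodes), edges_from_label))
edges <- data.frame(from = edge_list[, 1], to = edge_list[, 2],
                    stringsAsFactors = FALSE)

g <- graph_from_data_frame(edges, directed = FALSE)
```

Calling plot(g) draws the graph as before. Binding once at the end avoids the repeated copying that rbind() inside a loop causes, which is usually where the time goes on long vectors.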
MCS
Do you need to have a vector? It is much easier to pass igraph a data frame with from and to columns. This keeps the code clean and easy to read. It will probably be faster too. – desval Dec 19 '20 at 13:04