I'm working on a logistic data set, trying to visualise the steps taken by the subjects in the data set.
My input data has the following form: ID_object;proces
The undertaken_proces has the form: A-B|B-C|C-D...
I'm visualising in visNetwork.
For the nodes I have a predefined dataframe where all possible nodes (A-Z) are declared and given some initial properties (e.g. all nodes startout gray and have a certain label).
For the links between the nodes, I've written a loop that:
- Looks at all rows in the dataset
- Per row takes the undertaken_proces columns
- Splits the undertaken_proces by "|" to get the individual movements ( A-B, B-C, ...)
- Splits the result above by "-" to get a list of vectors where the first element is the "from-object" and the second element the "to-object"
- Loop over the list of vectors and fill one vector containing all "from-objects" and one vector containg all "to-objects".
- construct a dataframe with columns from, to based on the vector above and som additional columns for additional properties (e.d. color of the link direction of arrow)
- Per row in the parent loop I check what the last object is where the ID was and change the color for that node to another color
- After the loop grouping the links dataframe by from, to, color and arrow to obtain the total movements per grouping
This way works fine for smaller datasets, but is teribly slow/unsuble for larger ones. I'm guessing there is a way with mapping and/or apply functions, but I havent figured it out yet. Can someone point me in the right direction?
library(visNetwork)
visNetwordGridLayout <- function(x)
{
x[is.na(x)] <- 0
x <- apply(t(x), 1, rev)
LmatX <- seq(-1,1,length=ncol(x))
LmatY <- seq(1,-1,length=nrow(x))
loc <- t(sapply(1:max(x),function(y)which(x==y,arr.ind=T)))
layout <- cbind(LmatX[loc[,2]],LmatY[loc[,1]])
return(layout)
}
ID_movements <- data.frame(
ID = c(1, 2),
PROCES = c("A-B|B-C|C-E", "A-B|B-C|C-D"),
stringsAsFactors = FALSE
)
# nodes
nodes <- data.frame(id = c("A", "B", "C", "D", "E"))
nodes <- nodes %>% mutate(label = id)
nodes$color.background <- rep('grey', nrow(nodes))
# Links
links <- data.frame()
for (row in 1:nrow(ID_movements)) {
ID_movement <- ID_movements[row, ]$PROCES
procesSteps <- strsplit(ID_movement , "\\|")[[1]]
procesSteps <- strsplit(procesSteps, '-')
fromVec <- c()
toVec <- c()
for (step in procesSteps){
fromVec <- c(fromVec , step[1])
toVec <- c(toVec , step[2])
}
links <- rbind(links,
data.frame(from = fromVec,
to = toVec,
color = 'blue',
arrows = 'to')
)
lastNode <- last(toVec)
nodes$color.background[nodes$id ==
lastNode] <- 'green'
}
links <- links %>%
group_by(from, to, color, arrows) %>%
summarise(label = n()) %>%
ungroup()
# Grid waar de nodes komen
grid <- matrix(
match(
c(
"A",0,0,0,0,
0,"B",0,0,0,
0,0,"C",0,"D",
0,0,0,0,"E"
),
nodes$id),
nrow=4,byrow=TRUE)
visNetwork(nodes, links) %>%
visIgraphLayout(layout = "layout.norm",
layoutMatrix = visNetwordGridLayout(grid))