How can I create groups for nodes and links and color them accordingly in a Sankey plot using networkD3 in R? This excellent example shows the data-formatting steps. Here is the code and plot from that example; I want to color the nodes and links in this plot by group.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = '
name  year1           year2         year3           year4
Bob   Hilton          Sheraton      Westin          Hyatt
John  "Four Seasons"  Ritz-Carlton  Westin          Sheraton
Tom   Ritz-Carlton    Westin        Sheraton        Hyatt
Mary  Westin          Sheraton      "Four Seasons"  Ritz-Carlton
Sue   Hyatt           Ritz-Carlton  Hilton          Sheraton
Barb  Hilton          Sheraton      Ritz-Carlton    "Four Seasons"
')
Format the data frame and create the Sankey plot:
library(dplyr)  # mutate(), group_by(), lead(), filter(), %>%
library(tidyr)  # pivot_longer()

links <-
  df %>%
  mutate(row = row_number()) %>%  # add a row id
  pivot_longer(-row, names_to = "column", values_to = "source") %>%  # gather all columns
  mutate(column = match(column, names(df))) %>%  # convert col names to col ids
  group_by(row) %>%
  mutate(target = lead(source, order_by = column)) %>%  # get target from following node in row
  ungroup() %>%
  filter(!is.na(target))  # remove links from last column in original data
links <-
  links %>%
  mutate(source = paste0(source, '_', column)) %>%
  mutate(target = paste0(target, '_', column + 1)) %>%
  select(source, target)
nodes <- data.frame(name = unique(c(links$source, links$target)))
nodes$label <- sub('_[0-9]*$', '', nodes$name) # remove column id from node label
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1
links$value <- 1
library(networkD3)
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source_id',
              Target = 'target_id', Value = 'value', NodeID = 'label')
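For reference, here is a minimal sketch of the kind of result I am after, not a confirmed solution. It uses the `NodeGroup`, `LinkGroup` and `colourScale` arguments of `sankeyNetwork()`. The `group` columns, the toy data, the palette and the `js_array` helper are my own assumptions for illustration: I group nodes by hotel name and give each link the group of its source node. Spaces in group names are replaced with underscores, because group names containing spaces are reported to break coloring.

```r
library(networkD3)

# Toy stand-ins for the `nodes` and `links` data frames built above
nodes <- data.frame(
  name = c("Hilton_1", "Four Seasons_1", "Sheraton_2", "Ritz-Carlton_2"),
  stringsAsFactors = FALSE
)
nodes$label <- sub("_[0-9]*$", "", nodes$name)

links <- data.frame(
  source = c("Hilton_1", "Four Seasons_1"),
  target = c("Sheraton_2", "Ritz-Carlton_2"),
  value  = 1,
  stringsAsFactors = FALSE
)
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1

# Assumption: one group per hotel; spaces replaced to keep group names safe
nodes$group <- gsub(" ", "_", nodes$label)

# Assumption: each link inherits the group of its source node
# (+ 1 because the ids are zero-based for D3)
links$group <- nodes$group[links$source_id + 1]

# Hypothetical helper: format a character vector as a JavaScript array literal
js_array <- function(x) paste0('["', paste(x, collapse = '", "'), '"]')

# Optional explicit mapping from group name to color (arbitrary palette)
groups  <- unique(nodes$group)
palette <- c("#1f77b4", "#ff7f0e", "#2ca02c", "#d62728")[seq_along(groups)]
colour_scale <- JS(sprintf(
  "d3.scaleOrdinal().domain(%s).range(%s)",
  js_array(groups), js_array(palette)
))

p <- sankeyNetwork(Links = links, Nodes = nodes, Source = "source_id",
                   Target = "target_id", Value = "value", NodeID = "label",
                   NodeGroup = "group", LinkGroup = "group",
                   colourScale = colour_scale)
p
```

Leaving out `colourScale` falls back to the package's default ordinal scale, so the custom palette is only needed when specific colors matter.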