I am trying to make an interactive Sankey diagram with the networkD3 package. I have a dataset with eight columns.
df <- read.csv(header = TRUE, as.is = TRUE, text = '
clientcode,year1,year2,year3,year4,year5,year6,year7
1,DBC,DBBC,DBBC,DBC,DBC,"Not in care","Not in care"
2,DBC,DBBC,DBBC,"Not in care","Not in care","Not in care","Not in care"
3,DBC,DBBC,"Not in care","Not in care","Not in care","Not in care","Not in care"
4,DBC,DBBC,"Not in care","Not in care","Not in care","Not in care","Not in care"
5,DBC,DBBC,DBBC,"Not in care","Not in care","Not in care","Not in care"
')
I am using the code from this answer, which starts with "This question comes up a lot...": https://stackoverflow.com/a/52237151/4389763
This is the code I have:
library(dplyr)  # %>%, select(), mutate(), group_by(), lead(), ...
library(tidyr)  # gather()
df <- df %>% select(year1,year2,year3,year4,year5,year6,year7)
links <-
  df %>%
  mutate(row = row_number()) %>%
  gather('column', 'source', -row) %>%
  mutate(column = match(column, names(df))) %>%
  group_by(row) %>%
  arrange(column) %>%
  mutate(target = lead(source)) %>%
  ungroup() %>%
  filter(!is.na(target))
links <-
  links %>%
  mutate(source = paste0(source, '_', column)) %>%
  mutate(target = paste0(target, '_', column + 1)) %>%
  select(source, target)
nodes <- data.frame(name = unique(c(links$source, links$target)))
links$source <- match(links$source, nodes$name) - 1
links$target <- match(links$target, nodes$name) - 1
links$value <- 1
nodes$name <- sub('_[0-9]+$', '', nodes$name)
library(networkD3)
library(htmlwidgets)
sankeyNetwork(Links = links, Nodes = nodes, Source = 'source',
Target = 'target', Value = 'value', NodeID = 'name')
But I don't know how to make each link show the total value of its flow. For example, DBC to DBBC occurs five times from year1 to year2, and DBBC to DBBC occurs three times from year2 to year3. With the code above, every occurrence is drawn as a separate link with value 1, and I would like to see a single link per flow with its total value.
Something like this example of a Sankey diagram, where you can see the total for, say, group_A to group_C rather than every individual occurrence.
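To make the goal concrete, here is a minimal sketch of the kind of aggregation I am after. It assumes the `links` data frame looks the way it does right after the `select(source, target)` step, that is, one row per client per year transition with string labels. The toy data and the name `links_agg` are made up for illustration and cover only the year1 to year3 transitions of my dataset.

```r
library(dplyr)

# Toy stand-in for `links` after select(source, target):
# one row per client per transition (year1 -> year2, year2 -> year3).
links <- data.frame(
  source = c(rep("DBC_1", 5), rep("DBBC_2", 5)),
  target = c(rep("DBBC_2", 5),
             "DBBC_3", "DBBC_3", "Not in care_3", "Not in care_3", "DBBC_3"),
  stringsAsFactors = FALSE
)

# Collapse identical source/target pairs into a single row and
# store how often each pair occurs in `value`.
links_agg <- links %>%
  count(source, target, name = "value")

print(links_agg)
```

If this is the right idea, the `match(...) - 1` index conversion would presumably run on `links_agg` instead of `links`, and the `links$value <- 1` line would be dropped because `value` already holds the counts.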
Is it also possible to show percentages in the mouse-over tooltip? For example, for Year1 = DBC to Year2 = DBBC the value is 5 out of 5, so the percentage is 100%.
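The percentage I have in mind is each flow's share of everything leaving the same source node. A rough sketch of that calculation, assuming an already aggregated data frame with one row per flow (the names `links_agg`, `links_pct`, `total` and `pct` are hypothetical, and the numbers are the year1 to year3 counts from my data):

```r
library(dplyr)

# Hypothetical aggregated links: one row per flow with its total count.
links_agg <- data.frame(
  source = c("DBC_1", "DBBC_2", "DBBC_2"),
  target = c("DBBC_2", "DBBC_3", "Not in care_3"),
  value  = c(5, 3, 2),
  stringsAsFactors = FALSE
)

# For every source node, compute the total outgoing count and each
# flow's percentage of that total.
links_pct <- links_agg %>%
  group_by(source) %>%
  mutate(total = sum(value),
         pct   = round(100 * value / total, 1)) %>%
  ungroup()

print(links_pct)
```

This only computes the numbers. As far as I can tell, `sankeyNetwork()` has no argument for custom tooltip text, so getting `pct` into the mouse-over would probably need some custom JavaScript, for instance through `htmlwidgets::onRender()`, and that is the part I am unsure about.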
Can someone help me? Thank you.