I would like to use the ggforce package in R to create a Sankey diagram because I prefer the look of the parallel sets plots made with ggforce to other visualizations. I have nodes that are sorted into levels from left to right. However, I want some of the links to go directly from, say, level 1 to level 5 without touching the nodes in between. This image shows a Sankey diagram made with networkD3 where "production" links directly to "wasted" across the intervening levels.

Is this type of diagram possible to create with ggforce? I have tried the following, but it returns an error because ggforce does not allow missing values in any of the levels.

Input data

fsc_sankey <- structure(list(
  stage1 = c("production", "production", "production", "production",
             "production", "production", "production", "production",
             "production"),
  stage2 = c(NA, "processing", "processing", "processing", "processing",
             "processing", "processing", "processing", "processing"),
  stage3 = c(NA, NA, "retail", NA, NA, NA, NA, "retail", "retail"),
  stage4 = c(NA, NA, NA, "foodservice", "foodservice", "institutions",
             "institutions", "households", "households"),
  destination = c("waste", "waste", "waste", "consumed", "waste",
                  "consumed", "waste", "consumed", "waste"),
  value = c(1L, 1L, 1L, 3L, 1L, 3L, 1L, 3L, 1L)
), class = "data.frame", row.names = c(NA, -9L))

Code

library(tidyverse)
library(ggforce)

fsc_sankey_set <- gather_set_data(fsc_sankey, 1:5) %>%
  mutate(x = factor(x, levels = c('stage1','stage2','stage3','stage4','destination'))) 

ggplot(fsc_sankey_set, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white')
qdread
  • Hi @qdread, do you have any updates on this question? Thanks! – amedicalenthusiast Oct 19 '22 at 10:50
  • @amedicalenthusiast unfortunately, no, I ended up abandoning my attempt to make ggforce work. I highly recommend the [ggalluvial](https://corybrunson.github.io/ggalluvial/articles/ggalluvial.html) package which can produce a diagram like the one in my original question. – qdread Oct 19 '22 at 11:57
  • Would you mind sharing your ggalluvial code that allows skipping nodes? Thanks in advance. – amedicalenthusiast Oct 19 '22 at 13:18
  • Or perhaps, do you have any idea whether the same code works for ggsankey? I'm having a bit of difficulty converting my current ggsankey dataset to ggalluvial format – amedicalenthusiast Oct 19 '22 at 14:15
  • Let me look into it and get back to you; it's been a while since I have worked with this. – qdread Oct 19 '22 at 14:17
  • Hi @qdread it happens that my question is already answered here https://stackoverflow.com/questions/74141362/how-to-skip-nodes-with-na-value-in-ggsankey/74141655?. Thanks for the help! – amedicalenthusiast Oct 20 '22 at 15:04
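
For readers who want to see exactly which links the diagram in the question calls for, here is a minimal base R sketch that drops the NA stages from each row and pairs every remaining node with the next one, so that row 1 yields a direct "production" to "waste" link. It does not make ggforce accept missing values; it only derives a source/target/value table from the question's data. The names `stage_cols` and `links` are illustrative and not part of any package.

```r
# Rebuild the question's data compactly (same values as fsc_sankey above).
fsc_sankey <- data.frame(
  stage1 = rep("production", 9),
  stage2 = c(NA, rep("processing", 8)),
  stage3 = c(NA, NA, "retail", NA, NA, NA, NA, "retail", "retail"),
  stage4 = c(NA, NA, NA, "foodservice", "foodservice", "institutions",
             "institutions", "households", "households"),
  destination = c("waste", "waste", "waste", "consumed", "waste",
                  "consumed", "waste", "consumed", "waste"),
  value = c(1L, 1L, 1L, 3L, 1L, 3L, 1L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Columns that hold the nodes, in left-to-right order (illustrative name).
stage_cols <- c("stage1", "stage2", "stage3", "stage4", "destination")

# For each row, drop the NA stages and pair each remaining node with the next.
links <- do.call(rbind, lapply(seq_len(nrow(fsc_sankey)), function(i) {
  path <- unlist(fsc_sankey[i, stage_cols], use.names = FALSE)
  path <- path[!is.na(path)]
  data.frame(source = head(path, -1), target = tail(path, -1),
             value = fsc_sankey$value[i], stringsAsFactors = FALSE)
}))

# Sum the values of identical source/target pairs.
links <- aggregate(value ~ source + target, data = links, FUN = sum)
print(links)
```

Each row of `links` is one ribbon of the intended diagram, including the ones that skip levels, which makes it easy to check a plot from any package against the data.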
