
I'm creating a Sankey diagram with plotly and using the built-in 'groups' option to combine nodes. When I do this, the grouped node is rendered in black and no label is shown. I can see why, since the colors of the grouped nodes could differ, but I don't see how to set the color of the group itself. The same goes for the label. Is there a way to define these?

example code:

import plotly.graph_objs as go
from plotly.offline import plot

value = [3,5,2,4,6]
source = [0,0,1,0,3]
target = [1,4,2,3,4]
color = ["blue","yellow","orange","orange","purple"]
label = ["A","B","C1","C2","D"]


data = dict(
    type = 'sankey',
    arrangement = 'freeform',
    node = dict(
        pad = 15,
        thickness = 20,
        line = dict(
            color = "black",
            width = 0.1
        ),
        groups = [[2,3]],
        label = label,
        color = color,
    ),
    link = dict(
        source = source,
        target = target,
        value = value,
    )
)

layout = dict(
    title = "Sankey test",
    font = dict(
        size = 10
    )
)
f = go.FigureWidget(data=[data], layout=layout)
plot(f)

This renders the grouped node in black, without a label.

Chrisvdberge

1 Answer


Your snippet gives me the following error:

ValueError: Invalid property specified for object of type plotly.graph_objs.sankey.Node: 'groups'

Since I don't know which versions of plotly and Python (and Jupyter Notebook?) you are running, I would simply suggest that you restructure your source data and merge C1 and C2 into a single node C before you build your plot. Keep in mind that links are assigned in the order they appear in the dataset, and that node labels and colors are matched to nodes by their position in the lists, which is the index the link source and target values refer to.
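One way to do that restructuring is sketched below in plain Python. The helper `merge_nodes` is a hypothetical name used for illustration and is not part of plotly. It collapses a set of node indices into one node, re-indexes the links, and sums the values of links that end up with the same source and target. The label "C" and the color "orange" for the merged node are choices made here, not something plotly derives.

```python
from collections import defaultdict


def merge_nodes(labels, colors, source, target, value, group, new_label, new_color):
    """Collapse the node indices in `group` into one node and re-index the links.

    Hypothetical helper for illustration; it is not part of plotly.
    """
    group = set(group)
    first = min(group)
    new_labels, new_colors, remap = [], [], {}
    for old_idx, (lab, col) in enumerate(zip(labels, colors)):
        if old_idx in group and old_idx != first:
            continue  # later group members reuse the merged node's slot
        remap[old_idx] = len(new_labels)
        new_labels.append(new_label if old_idx == first else lab)
        new_colors.append(new_color if old_idx == first else col)
    for old_idx in group:
        remap[old_idx] = remap[first]

    # Sum the values of links that now share the same (source, target) pair.
    totals = defaultdict(int)
    for s, t, v in zip(source, target, value):
        s2, t2 = remap[s], remap[t]
        if s2 != t2:  # skip links that would loop inside the merged node
            totals[(s2, t2)] += v

    new_source = [s for s, _ in totals]
    new_target = [t for _, t in totals]
    new_value = list(totals.values())
    return new_labels, new_colors, new_source, new_target, new_value


# Data from the question; nodes 2 ("C1") and 3 ("C2") are merged into "C".
labels2, colors2, source2, target2, value2 = merge_nodes(
    labels=["A", "B", "C1", "C2", "D"],
    colors=["blue", "yellow", "orange", "orange", "purple"],
    source=[0, 0, 1, 0, 3],
    target=[1, 4, 2, 3, 4],
    value=[3, 5, 2, 4, 6],
    group=[2, 3],
    new_label="C",
    new_color="orange",
)
```

The returned lists can be passed as the node `label` and `color` and the link `source`, `target` and `value` of the Sankey trace, with no `groups` entry needed.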

Code:

# imports (only the names used below)
import pandas as pd
from plotly.offline import init_notebook_mode, iplot

# settings: enable offline plotting inside a Jupyter notebook
init_notebook_mode(connected=True)

# Nodes & links
nodes = [['ID', 'Label', 'Color'],
        [0,'A','blue'],
        [1,'B','yellow'],
        [2,'C','orange'],
        [3,'D','purple'],
        ]

# links between the merged nodes (values are illustrative)
links = [['Source','Target','Value','Link Color'],
        [0,1,3,'rgba(200, 205, 206, 0.6)'],
        [0,2,5,'rgba(200, 205, 206, 0.6)'],
        [0,3,5,'rgba(200, 205, 206, 0.6)'],
        [1,2,6,'rgba(200, 205, 206, 0.6)'],
        [2,3,6,'rgba(200, 205, 206, 0.6)'],
        ]

# Retrieve headers and build dataframes
nodes_headers = nodes.pop(0)
links_headers = links.pop(0)
df_nodes = pd.DataFrame(nodes, columns = nodes_headers)
df_links = pd.DataFrame(links, columns = links_headers)

# Sankey plot setup
data_trace = dict(
    type = 'sankey',
    domain = dict(
        x = [0,1],
        y = [0,1]
    ),
    orientation = "h",
    valueformat = ".0f",
    node = dict(
        pad = 10,
        # thickness = 30,
        line = dict(
            color = "black",
            width = 0
        ),
        label = df_nodes['Label'].dropna(axis=0, how='any'),
        color = df_nodes['Color']
    ),
    link = dict(
        source = df_links['Source'].dropna(axis=0, how='any'),
        target = df_links['Target'].dropna(axis=0, how='any'),
        value = df_links['Value'].dropna(axis=0, how='any'),
        color = df_links['Link Color'].dropna(axis=0, how='any'),
    )
)

layout = dict(
    title = "Sankey Test",
    height = 772,
    font = dict(
        size = 10
    )
)

fig = dict(data=[data_trace], layout=layout)
iplot(fig, validate=False)

My system info:

The version of the notebook server is: 5.6.0
The server is running on this version of Python:
Python 3.7.0 (default, Jun 28 2018, 08:04:48) [MSC v.1912 64 bit (AMD64)]
vestland