
I would like to hide specific nodes (in my case, the rightmost) while preserving the size of intermediate nodes. As a simplistic example:

import plotly.graph_objects as go

link_data = dict(
    source = [0,1,1],
    target = [1,2,2],
    value = [1,1,1]
)

node_data = dict(
    label = ['a','b','c'],
)

fig = go.Figure(
    data = [go.Sankey(
        node = node_data,
        link = link_data
    )]
)
fig.show()

Results in:

[Image: with_unwanted_c]

But I want something more like this:

[Image: what_I_want]

Some approaches I've tried:

  • I can remove the extra b-to-c connection and feed it back to b. This preserves the height of node b, but adds a circular link (which I don't want). This might be ok if I could remove the loop.
  • I can specify link colors as ['grey','white','white'] (or 'rgba(0,0,0,0)' in place of 'white') and node colors as ['blue','blue','white'], but this isn't the best looking: it adds a large pad of space to the right. It also seems to add unnecessary elements to the figure, which matters to me for performance when my figure is complex.

-Python 3.8, Plotly 5.3.1

Docuemada
  • @Rob Raymond's answer is effectively the workaround that I took as of 10/10/2021. Should additional functionality be added to plotly's Sankey diagram that allows for a simple solution without the limitations (i.e., padding and unwanted roll-over), we can add it. – Docuemada Oct 12 '21 at 14:28

1 Answer

  • This re-uses the approach to building a Sankey plot from "plotly sankey graph data formatting".
  • I used a slightly more sophisticated version of your second approach. As you have noted, this means two things:
    1. there is still space to the right of the chart
    2. the hover info is still there
  • I have extended the sample data to show that node d is also invisible, as it is an end node with no flows going out of it.
import pandas as pd
import numpy as np
import plotly.graph_objects as go
import plotly.express as px

links = [
    {"source": "a", "target": "b", "value": 1},
    {"source": "b", "target": "c", "value": 1},
    {"source": "b", "target": "c", "value": 1},
    {"source": "b", "target": "d", "value": 1}
]

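df = pd.DataFrame(links)
# Map each node name to the integer index that go.Sankey expects
nodes = np.unique(df[["source", "target"]], axis=None)
nodes = pd.Series(index=nodes, data=range(len(nodes)))
# End nodes (targets that are never a source) are the ones to hide
invisible = set(df["target"]) - set(df["source"])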

fig = go.Figure(
    go.Sankey(
        node={
            "label": [n if n not in invisible else "" for n in nodes.index],
            "color": [
                px.colors.qualitative.Plotly[i%len(px.colors.qualitative.Plotly)]
                if n not in invisible
                else "rgba(0,0,0,0)"
                for i, n in enumerate(nodes.index)
            ],
            "line": {"width": 0},
        },
        link={
            "source": nodes.loc[df["source"]],
            "target": nodes.loc[df["target"]],
            "value": df["value"],
            "color": [
                "lightgray" if n not in invisible else "rgba(0,0,0,0)"
                for n in df["target"]
            ],
        },
    )
)

fig.show()
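
The rule that picks the invisible nodes in the code above (targets that never appear as a source) can be checked in isolation. This is a small sketch using plain Python sets instead of pandas; the tuple layout and the `link_hidden` name are assumptions made for illustration.

```python
# Same sample links as above, as (source, target, value) tuples.
links = [("a", "b", 1), ("b", "c", 1), ("b", "c", 1), ("b", "d", 1)]

sources = {s for s, _, _ in links}
targets = {t for _, t, _ in links}

# End nodes: they receive flow but never send any.
invisible = targets - sources

# A link is hidden when it flows into an end node.
link_hidden = [t in invisible for _, t, _ in links]

print(sorted(invisible))  # ['c', 'd']
print(link_hidden)        # [False, True, True, True]
```

Note that this rule hides every node without outgoing flows, so an end node that should stay visible would have to be removed from `invisible` by hand.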


Rob Raymond