The sample data is as follows:

unique_list = ['home0', 'page_a0', 'page_b0', 'page_a1', 'page_b1', 
               'page_c1', 'page_b2', 'page_a2', 'page_c2', 'page_c3']
sources = [0, 0, 1, 2, 2, 3, 3, 4, 4, 7, 6]
targets = [3, 4, 4, 3, 5, 6, 8, 7, 8, 9, 9]
values = [2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2]

Using the sample code from the documentation:

import plotly.graph_objects as go

fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=unique_list,
        color="blue",
    ),
    link=dict(
        source=sources,
        target=targets,
        value=values,
    ),
)])

fig.show()

This outputs the following Sankey diagram:

[Image: resulting Sankey diagram]

However, I would like all nodes whose labels end in the same number to sit in the same vertical column, just as the leftmost column contains only nodes ending in 0. I see in the docs that node positions can be moved, but I was wondering whether there is a cleaner way to do it than manually entering x and y values. Any help is appreciated.

vestland
bbd108

1 Answer

In go.Sankey() set arrangement='snap' and adjust x and y positions in x=<list> and y=<list>. The following setup will place your nodes as requested.

Plot:

[Image: Sankey diagram with the nodes placed as requested]

Please note that the y-values are not set individually for each node in this example. As soon as there is more than one node for a common x-value, the y-values are adjusted automatically so that all nodes in that column can be displayed. If you do want to set all positions explicitly, just set arrangement='fixed'.
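If you do go the arrangement='fixed' route, you need a y-value for every node. Here is a minimal sketch of one way to generate them, assuming you simply want the nodes in each column spaced evenly. The helper name spread_y and the 0.1 to 0.9 range are my own choices for illustration, not part of plotly, and the x_values list is just an example of four columns:

```python
from collections import defaultdict


def spread_y(x_values, low=0.1, high=0.9):
    """Give nodes that share an x-value evenly spaced y-values in [low, high].

    Hypothetical helper for illustration; not a plotly function.
    """
    # group node indices by their column (identical x-value)
    columns = defaultdict(list)
    for idx, x in enumerate(x_values):
        columns[x].append(idx)

    y_values = [0.0] * len(x_values)
    for indices in columns.values():
        n = len(indices)
        for rank, idx in enumerate(indices):
            # a lone node sits in the middle of the range
            frac = 0.5 if n == 1 else rank / (n - 1)
            y_values[idx] = low + frac * (high - low)
    return y_values


# example x-positions: three columns of three nodes and one single node
x_values = [0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.5, 0.5, 0.5, 0.75]
y_values = spread_y(x_values)
```

The resulting lists would be passed as x=x_values and y=y_values inside node=dict(...). Even spacing ignores how tall each node is drawn, so nodes carrying large values may still overlap and need manual adjustment.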

Edit:

I've added a custom function nodify() that assigns identical x-positions to labels that share a common ending, such as '0' in ['home0', 'page_a0', 'page_b0']. Now if you, for example, change page_c1 to page_c2, you'll get this:

[Image: Sankey diagram after changing page_c1 to page_c2]

Complete code:

import plotly.graph_objects as go
unique_list = ['home0', 'page_a0', 'page_b0', 'page_a1', 'page_b1', 
               'page_c1', 'page_b2', 'page_a2', 'page_c2', 'page_c3']
sources = [0, 0, 1, 2, 2, 3, 3, 4, 4, 7, 6]
targets = [3, 4, 4, 3, 5, 6, 8, 7, 8, 9, 9]
values = [2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 2]


def nodify(node_names):
    # unique name endings (last character of each label)
    ends = sorted(set(e[-1] for e in node_names))

    # interval between columns
    step = 1 / len(ends)

    # x-value for each unique name ending,
    # for input as node position
    nodes_x = {}
    x_val = 0
    for e in ends:
        nodes_x[e] = x_val
        x_val += step

    # x and y values in list form
    x_values = [nodes_x[n[-1]] for n in node_names]
    # constant y for every node; with arrangement='snap' the
    # vertical positions are adjusted automatically
    y_values = [0.1] * len(x_values)

    return x_values, y_values

nodified = nodify(node_names=unique_list)

# plotly setup
fig = go.Figure(data=[go.Sankey(
    arrangement='snap',
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=unique_list,
        color="blue",
        x=nodified[0],
        y=nodified[1],
    ),
    link=dict(
        source=sources,
        target=targets,
        value=values,
    ),
)])

fig.show()
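
Note that nodify() only looks at the last character of each label, so a label like 'page_a10' would land in the '0' column. Below is a rough sketch of a variant that groups by the whole trailing number instead. The name nodify_by_suffix and the x_min/x_max parameters are hypothetical, and unlike nodify() it spreads the columns across the full width rather than stepping by 1/len(ends). Keeping the positions slightly inside the 0 to 1 range is only a precaution I'm assuming is harmless, not something the code above requires:

```python
import re


def nodify_by_suffix(node_names, x_min=0.001, x_max=0.999):
    """Assign one x-position per trailing number found in the labels.

    Hypothetical variant of nodify(); not part of plotly.
    """
    # extract the full trailing number of each label, e.g. 'page_b10' -> 10
    suffixes = []
    for name in node_names:
        match = re.search(r"(\d+)$", name)
        if match is None:
            raise ValueError(f"no numeric suffix in {name!r}")
        suffixes.append(int(match.group(1)))

    # one column per distinct number, ordered numerically
    columns = sorted(set(suffixes))
    if len(columns) == 1:
        x_of = {columns[0]: x_min}
    else:
        step = (x_max - x_min) / (len(columns) - 1)
        x_of = {c: x_min + i * step for i, c in enumerate(columns)}

    x_values = [x_of[s] for s in suffixes]
    # same constant y as in nodify()
    y_values = [0.1] * len(x_values)
    return x_values, y_values


x_values, y_values = nodify_by_suffix(["home0", "page_a2", "page_b10", "page_c2"])
```

The returned lists are used the same way as nodified[0] and nodified[1] above. Sorting the suffixes as integers keeps column 10 to the right of column 2, which a plain string sort would not.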
vestland
  • Thank you for your explanation. It was really helpful and worked in my case, too. However, I do not understand why you have to specify the y-values as well if you only want to modify the x-values. In the case above, the result would be different if the line y=nodified[1] were skipped. – Lan Feb 04 '22 at 18:02
  • @Lan It seems like plotly ignores the `x` argument unless you also include `y`. The same happens if `y` is simply an array of zeros. The final layout *does* seem to be affected by the `y` values, though. @vestland's use of a constant 0.1 seems to give the best results for me as well. – Geoff Jun 04 '22 at 08:53