I want to build a visualization that shows how often one state changes to another, where the states are represented as numbers. My data looks like this and is called df_sankey (columns State_A, State_B and Freq):

I was thinking of a Sankey diagram, following the example from the documentation. I want one column of nodes for the states A, labeled I1, I2, ..., I20, and another column for the states B, labeled F1, F2, ..., F20. The frequency of each pair of values would then be drawn as a weighted link, as follows.

However, I can't get the nodes in each column sorted by state number. This is what I want to achieve.

This is what I have tried:

import numpy as np
import pandas as pd
import plotly.graph_objects as go

#Create Labels
source = pd.DataFrame(np.arange(1,21), columns = ['source'])['source'].apply(lambda x: 'I' + str(x))
target = pd.DataFrame(np.arange(1,21), columns = ['target'])['target'].apply(lambda x: 'F' + str(x))
labels = pd.concat([source, target], axis=0).reset_index(drop=True)

#X-node
x_node = np.concatenate((np.ones(int(len(source)))*0.1, np.ones(int(len(target)))), axis = None)

#Y-node
y_node = np.tile(np.linspace(0,100,len(source)),2)

#Create Dataframe
df_nodes = pd.DataFrame(data = {'label': labels, 'X': x_node, 'Y': y_node})

#PLOT

fig = go.Figure(data=[go.Sankey(
    arrangement='snap',
    node = dict(
      pad = 15,
      thickness = 20,
      line = dict(color = "black", width = 0.5),
      label = df_nodes['label'],
      color = "blue",
      x = df_nodes['X'],
      y = df_nodes['Y']
    ),
    link = dict(
      source = df_sankey['State_A']-1, #Zero-based indices into labels: I1..I20 -> 0..19
      target = df_sankey['State_B']+20-1, #F1..F20 -> 20..39
      value = df_sankey['Freq']
  ))])

fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()
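
One thing that may be worth checking in the attempt above: the Plotly reference describes node.x and node.y as normalized positions, while y_node here runs from 0 to 100 and x_node reaches exactly 1. The following is only a minimal sketch of how the labels, positions and link indices could be built with everything kept strictly inside (0, 1). The helper names sankey_node_layout and link_indices, the margin value and the 0.1 / 0.9 column positions are my own assumptions, not anything from the documentation, and I have not confirmed that this alone fixes the ordering.

```python
def sankey_node_layout(n_states, x_left=0.1, x_right=0.9, margin=0.02):
    """Return labels, x and y for two columns of n_states nodes each.

    Hypothetical helper: positions are normalized and deliberately
    avoid the endpoints 0 and 1.
    """
    labels = [f"I{i}" for i in range(1, n_states + 1)] + \
             [f"F{i}" for i in range(1, n_states + 1)]
    # Evenly spaced y values strictly inside (0, 1), in state order.
    step = (1 - 2 * margin) / max(n_states - 1, 1)
    column_y = [margin + k * step for k in range(n_states)]
    x = [x_left] * n_states + [x_right] * n_states
    y = column_y * 2  # same vertical order in both columns
    return labels, x, y


def link_indices(state_a, state_b, n_states):
    """Map 1-based state numbers to zero-based node indices."""
    source = [a - 1 for a in state_a]             # I1..In -> 0..n-1
    target = [b - 1 + n_states for b in state_b]  # F1..Fn -> n..2n-1
    return source, target


labels, x, y = sankey_node_layout(20)
# Two made-up transitions (1 -> 2 and 3 -> 20), for illustration only.
source, target = link_indices([1, 3], [2, 20], 20)
```

The resulting labels, x and y would go into node = dict(label=..., x=..., y=...) and source / target into link, in place of the df_nodes columns. Since arrangement='snap' is allowed to move nodes, trying arrangement='fixed' or 'freeform' alongside this might also be worth a test.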

Any ideas?

Mr. T
A screenshot of your dataframe is not very helpful. Please share a sample of your data as described [here](https://stackoverflow.com/questions/63163251/pandas-how-to-easily-share-a-sample-dataframe-using-df-to-dict/63163254#63163254) – vestland Jan 26 '21 at 19:20

0 Answers