
I would like to create a graph where nodes have a different weight and colour based on values in columns.

A sample of the data is:

Node Weight Colour Neighbours
1 23 red [3,5]
3 18 blue [2]
4 50 blue []
5 18 blue [1]
2 36 green [3]

The table above shows the links by nodes:

  • node 1 is linked with 3 and 5. It has weight 23 and colour red
  • node 2 is linked with 3. It has weight 36 and colour green
  • node 3 is linked with 2. It has weight 18 and colour blue
  • node 4 has no links. It has weight 50 and colour blue
  • node 5 is linked with 1. It has weight 18 and colour blue

For building the network I have done as follows

d = dict(df.drop_duplicates(subset=['Node','Colour'])[['Node','Colour']].to_numpy().tolist())

nodes = G.nodes()
plt.figure(figsize=(30,30)) 
pos = nx.draw(G, with_labels=True, 
              nodelist=nodes,
              node_color=[d.get(i,'lightgreen') for i in nodes], 
              node_size=1000) 

Unfortunately, the colours are all wrong! I am also having difficulty adding the weight information. The ties among nodes should have weights assigned to them. I have tried edge_attr='weight' in nx.draw, where 'weight' = df['Weight']. I hope you can help me and let me know what I have done wrong.

Tomerikoo
V_sqrt

1 Answer

node_color=[d.get(i,'lightgreen') for i in nodes], 

This method of coloring didn't work because you are assigning colors according to the nodes instead of their colours.

"The ties among nodes should have weights assigned to them. I have tried with edge_attr='weight' in nx.draw, where 'weight' = df['Weight']."

In the solution I present, only the nodes have weights, not the edges. If you intend to weight the edges as well, add a column to your data frame, e.g.:

Node Weight Colour Neighbours Edge_Weights
1 23 red [3,5] [w1_3,w1_5]
3 18 blue [2] [w3_2]
4 50 blue [] []
5 18 blue [1] [w5_1]
2 36 green [3] [w2_3]

Next, add the edge weights using G.add_edge(n1, n2, weight=w); see the NetworkX documentation for add_edge.
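As a minimal sketch of that step, the snippet below pairs each neighbour with the weight at the same position in an Edge_Weights column. The numeric values are hypothetical placeholders for w1_3, w1_5 and so on, and the column names are assumed to match the table above.

```python
import networkx as nx
import pandas as pd

# Hypothetical numeric edge weights standing in for w1_3, w1_5, ...
df = pd.DataFrame({
    "Node": [1, 3, 4, 5, 2],
    "Neighbours": [[3, 5], [2], [], [1], [3]],
    "Edge_Weights": [[0.5, 2.0], [1.5], [], [2.0], [3.0]],
})

G = nx.DiGraph()
for row in df.itertuples(index=False):
    G.add_node(row.Node)
    # Pair each neighbour with the weight at the same list position.
    for neigh, w in zip(row.Neighbours, row.Edge_Weights):
        G.add_edge(row.Node, neigh, weight=w)

# Dict keyed by (source, target); usable as edge labels when drawing.
edge_labels = nx.get_edge_attributes(G, "weight")
print(edge_labels)
```

To show the weights on a plot, one option is to compute a layout with `pos = nx.spring_layout(G)`, draw the graph with `nx.draw(G, pos, with_labels=True)`, and then call `nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)`.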

Since you have multiple attributes you want to add to your nodes, I would recommend iterating through your data frame using, for example, df.itertuples().

Here is the full code:

import networkx as nx
import pandas as pd

df = pd.DataFrame(data={'Node': [1, 3, 4, 5, 2],
                        'Weight': [23, 18, 50, 18, 36],
                        'Colour': ["red", "blue", "blue", "blue", "green"],
                        'Neighbors': [[3, 5], [2], [], [1], [3]]})


def get_graph_from_pandas(df):
    
    G = nx.DiGraph() # assuming the graph is directed, since e.g. node 1 has
                     # 3 as a neighbour but 3 doesn't have 1 as a neighbour
    
    
    for row in df.itertuples(): # row is the row of the dataframe
        n = row.Node   # change to row.(name of column with the node names)
        w = row.Weight # change to row.(name of column with the node weights)
        c = row.Colour # change to row.(name of column with the node colors)
        neighbors = row.Neighbors
        
        G.add_node(n, weight = w, colour = c)
        
        for neigh in neighbors:
            #add edge weights here, attribute of G.add_edge
            G.add_edge(n,neigh)  
            
    return G
        
        
        
G = get_graph_from_pandas(df)

print("Done.")
print("Total number of nodes: ", G.number_of_nodes())
print("Total number of edges: ", G.number_of_edges())

# nx.draw returns None, so there is nothing useful to assign to pos
nx.draw(G, with_labels=True,
        node_color=[node[1]['colour'] for node in G.nodes(data=True)],
        node_size=200)
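The drawing call above uses a fixed node_size, so the node weights are stored on the graph but not shown. A minimal sketch of one way to make them visible is to scale the size of each node by its weight attribute. The SCALE factor is a hypothetical value chosen only for illustration, and the graph is rebuilt by hand here so the snippet runs on its own.

```python
import networkx as nx

# Rebuild the sample graph with node attributes (same data as above).
G = nx.DiGraph()
rows = [(1, 23, "red"), (3, 18, "blue"), (4, 50, "blue"),
        (5, 18, "blue"), (2, 36, "green")]
for node, weight, colour in rows:
    G.add_node(node, weight=weight, colour=colour)
G.add_edges_from([(1, 3), (1, 5), (3, 2), (5, 1), (2, 3)])

SCALE = 20  # hypothetical factor: drawn size per unit of weight

# G.nodes(data="weight") yields (node, weight) pairs in node order,
# so both lists line up with the order nx.draw uses by default.
node_sizes = [SCALE * w for _, w in G.nodes(data="weight")]
node_colours = [c for _, c in G.nodes(data="colour")]
print(node_sizes)
```

These lists can then be passed to the drawing call as `nx.draw(G, with_labels=True, node_color=node_colours, node_size=node_sizes)`.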
willcrack
  • Hi willcrack, thanks for your answer. Could you please explain this line of code to me: `node_color=[node[1]['colour'] for node in G.nodes(data=True)]`? I am getting the error `KeyError: 'colour'` when I extend the graph to the whole dataset. Thanks a lot – V_sqrt Dec 08 '20 at 17:25
  • Hi! As long as you do `c = row.Colour` and `G.add_node(n, weight = w, colour = c)`, it shouldn't return that error – willcrack Dec 08 '20 at 17:29
  • `node_color=[node[1]['colour'] for node in G.nodes(data=True)]` This line uses a [list comprehension](https://www.programiz.com/python-programming/list-comprehension) to return a list with the colour of each of your nodes. – willcrack Dec 08 '20 at 17:30
  • Also, you can try: `for node in G.nodes(data=True): try: node[1]['colour'] except KeyError: print(node)` – willcrack Dec 08 '20 at 17:35
  • @Val I am sorry, I didn't understand "_The only difference is actually the name of node (not it is called node1) that I changed in row.Node1. Do you think it is causing the issue?_" Can you explain? – willcrack Dec 08 '20 at 17:38
  • Sorry, I had a problem with my wifi. I have found the problem using the exception handler. When I print the node, I get the following: `('a', {}) ('l', {}) ('t', {}) ('e', {})`. This happens because I also have words in the Node column (these nodes should actually be one, as the word is `alte`). Basically, instead of numbers, I should consider words. I think this is causing the issue. I am trying to figure it out, but still without success. I thought it could also work with a mix of numbers and words, but maybe I was wrong. I could create a new post, if you think that would be better – V_sqrt Dec 08 '20 at 19:12
  • I believe creating a new post is the best option – willcrack Dec 08 '20 at 19:15
  • But from the output you can see that the nodes are being created; however, they don't have the weight and colour attributes – willcrack Dec 08 '20 at 19:16
  • I just created a new one: https://stackoverflow.com/questions/65205566/keyerror-when-assign-colours-to-nodes if you could please have a look, it would be great. Thanks a lot for your help and time – V_sqrt Dec 08 '20 at 19:25
  • Np @Val ;) glad I could help – willcrack Dec 08 '20 at 22:09
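The output in the comments, `('a', {}) ('l', {}) ('t', {}) ('e', {})`, would be consistent with a neighbour stored as the plain string `alte` rather than as a list. That cause is an assumption, since the full dataset is not shown. The sketch below reproduces the symptom under that assumption and shows one way to avoid the KeyError by supplying a default colour.

```python
import networkx as nx

G = nx.DiGraph()
G.add_node(1, weight=23, colour="red")

# Iterating over a string yields its characters, so a neighbour stored
# as the string "alte" (instead of the list ["alte"]) creates four nodes.
for neigh in "alte":
    G.add_edge(1, neigh)

# Nodes created implicitly by add_edge have an empty attribute dict,
# which is what raises KeyError: 'colour' in the list comprehension.
missing = [n for n, attrs in G.nodes(data=True) if "colour" not in attrs]
print(missing)  # ['a', 'l', 't', 'e']

# Asking for one attribute with a default avoids the KeyError.
node_colours = [c for _, c in G.nodes(data="colour", default="lightgreen")]
print(node_colours)
```

The default only hides the symptom. If the neighbours really are strings, they still need to be parsed into lists before the edges are added.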