new to network graphs so was hoping for a little guidance...
I'm trying to create a network graph between users collaborating with each other, my problem is I cannot figure out how to add multiple dimensions to the network.
At a high level I want to show:
- User-to-User interactions
- Add some sort of size indication of users who are collaborating more (via the edges, the more interactions the thicker the line between the two users).
- Add color to the edges/lines indicating the project they worked on together
- Add color to node based on user license type
So something like:
So far I have the following:
import pandas as pd
import networkx as nx
from pyvis.network import Network
df_dict = {'PROJECT': {0: 'Finance Project', 1: 'Finance Project', 2: 'Finance Project', 3: 'Finance Project', 4: 'Finance Project', 5: 'Finance Project', 6: 'Finance Project', 7: 'Finance Project', 8: 'Finance Project', 9: 'Finance Project', 10: 'Finance Project', 11: 'Finance Project', 12: 'HR Project', 13: 'Finance Project', 14: 'HR Project', 15: 'Finance Project'},
'PLAN': {0: 'COMPANY', 1: 'COMPANY', 2: 'COMPANY', 3: 'COMPANY', 4: 'COMPANY', 5: 'COMPANY', 6: 'COMPANY', 7: 'COMPANY', 8: 'COMPANY', 9: 'COMPANY', 10: 'COMPANY', 11: 'COMPANY', 12: 'COMPANY', 13: 'COMPANY', 14: 'COMPANY', 15: 'COMPANY'},
'USER_ONE': {0: 'Mike Jones', 1: 'Eminem', 2: 'Mike Jones', 3: 'Mike Jones', 4: 'Michael Jordan', 5: 'Eminem', 6: 'Michael Jordan', 7: 'Michael Jordan', 8: 'Mike Jones', 9: 'Kobe Bryant', 10: 'Eminem', 11: 'Elon Musk', 12: 'Bill Gates', 13: 'Elon Musk', 14: 'Mark Zuckerberg', 15: 'Elon Musk'},
'USER_ONE_LICENSE': {0: 'FULL', 1: 'FULL', 2: 'FULL', 3: 'FULL', 4: 'FULL', 5: 'FULL', 6: 'FULL', 7: 'FULL', 8: 'FULL', 9: 'OCCASIONAL', 10: 'FULL', 11: 'FULL', 12: 'FULL', 13: 'FULL', 14: 'FULL', 15: 'FULL'},
'USER_ONE_LICENSE_COLOR': {0: 'lightgreen', 1: 'lightgreen', 2: 'lightgreen', 3: 'lightgreen', 4: 'lightgreen', 5: 'lightgreen', 6: 'lightgreen', 7: 'lightgreen', 8: 'lightgreen', 9: 'gray', 10: 'lightgreen', 11: 'lightgreen', 12: 'lightgreen', 13: 'lightgreen', 14: 'lightgreen', 15: 'lightgreen'},
'USER_ONE_DAYS_COLLAB': {0: 88, 1: 55, 2: 67, 3: 1, 4: 70, 5: 54, 6: 2, 7: 114, 8: 4, 9: 1, 10: 10, 11: 19, 12: 5, 13: 11, 14: 100, 15: 13},
'USER_TWO': {0: 'Michael Jordan', 1: 'Mike Jones', 2: 'Eminem', 3: 'Kobe Bryant', 4: 'Eminem', 5: 'Michael Jordan', 6: 'Elon Musk', 7: 'Mike Jones', 8: 'Elon Musk', 9: 'Mike Jones', 10: 'Elon Musk', 11: 'Eminem', 12: 'Mark Zuckerberg', 13: 'Michael Jordan', 14: 'Bill Gates', 15: 'Mike Jones'},
'USER_TWO_LICENSE': {0: 'FULL', 1: 'FULL', 2: 'FULL', 3: 'OCCASIONAL', 4: 'FULL', 5: 'FULL', 6: 'FULL', 7: 'FULL', 8: 'FULL', 9: 'FULL', 10: 'FULL', 11: 'FULL', 12: 'FULL', 13: 'FULL', 14: 'FULL', 15: 'FULL'},
'USER_TWO_LICENSE_COLOR': {0: 'lightgreen', 1: 'lightgreen', 2: 'lightgreen', 3: 'gray', 4: 'lightgreen', 5: 'lightgreen', 6: 'lightgreen', 7: 'lightgreen', 8: 'lightgreen', 9: 'lightgreen', 10: 'lightgreen', 11: 'lightgreen', 12: 'lightgreen', 13: 'lightgreen', 14: 'lightgreen', 15: 'lightgreen'},
'USER_TWO_DAYS_COLLAB': {0: 114, 1: 67, 2: 55, 3: 1, 4: 54, 5: 70, 6: 11, 7: 88, 8: 13, 9: 1, 10: 19, 11: 10, 12: 100, 13: 2, 14: 5, 15: 4}
, 'TOTAL_COLLABS': {0: 202, 1: 122, 2: 122, 3: 2, 4: 124, 5: 124, 6: 13, 7: 202, 8: 17, 9: 2, 10: 29, 11: 29, 12: 105, 13: 13, 14: 105, 15: 17}}
df = pd.DataFrame(df_dict)
# where do I add all the other attributes?
#i.e. license type, project, # of interactions (I'm assuming this can be something like weights?)
#In my case I believe my source + target needs to be Project + User?
G = nx.from_pandas_edgelist(df
,source='USER_ONE'
,target='USER_TWO' #I tried ['PROJECT','USER_TWO']
)
net = Network(notebook=True)
net.from_nx(G)
net.show_buttons(filter_=True)
net.show('example4.html')
All the examples I've seen only have one source and one target - mine needs user + project for both source and target. Is there a way to do this without creating one field that combines both?
Haven't found a clear way to color nodes , the example provided shows a case on the node value (in my case the node is just text, i have another dimension i want to refer to to dictate color)
Haven't found a clear way to build a case statement on edge width either. Concretely:
- if count of interactions <= 1 then "small width"
- if count of interactions > 1 and <=5 then "medium width" etc...
Any direction or resources would be greatly appreciated -- everything I come across seems to be different than my setup leaving me unsure how to proceed.
my table looks something like this for reference: