Is there a standard structure for adding edges from a CSV/TXT file into NetworkX? I've read the docs and have tried using read_edgelist('path.csv') and add_edges_from('path.csv'), but I received errors saying my data cannot be converted into dictionaries, and also "Edge tuple C must be a 2-tuple or a 3-tuple". I've reformatted a sample of my data several ways to test different structures, including lists of lists and lists of tuples, removing whitespace, and creating a single list of numbers in each row, but no luck. Below is some sample data of mine:
user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
3945641,"[[36, 73], [73, 110], [110, 110]]"
10024312,"[[123, 27], [27, 97], [97, 97], [97, 97], [97,110]]"
14270422,"[[0, 110], [110, 174]]"
14283758,"[[110, 184]]"
14373703,"[[35, 97], [97, 97], [97, 97], [97, 17], [17,58]]"
The purpose is to create a network graph of trajectories moving between (or within) clusters. Each inner list is a move either within a cluster or between clusters, e.g., [[0, 110], [110, 174]] is a move through clusters 0->110->174. Is there a way to format my data so that NetworkX can read it?
Quick sample code I was testing data with:
import networkx as nx
import matplotlib.pyplot as plt
g = nx.Graph()
edges = g.add_edges_from('path.csv')
nx.draw(g)
plt.draw()
plt.show()
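For reference, add_edges_from() iterates over whatever it is given, so passing the string 'path.csv' hands it one character at a time, which is why the error names a single character rather than an edge. Below is a minimal sketch of one way the file could be parsed first, assuming the cluster_moves column always holds a valid Python list literal. The helper name read_moves and the inline SAMPLE string are hypothetical stand-ins; with a real file, open('path.csv', newline='') would take the place of io.StringIO(SAMPLE).

```python
import ast
import csv
import io

import networkx as nx

# Stand-in for the real file, using rows from the sample data above.
SAMPLE = '''user_id,cluster_moves
11011,"[[86, 110], [110, 110]]"
2139671,"[[89, 125]]"
14270422,"[[0, 110], [110, 174]]"
'''


def read_moves(handle):
    """Yield (source, target) tuples from the cluster_moves column."""
    for row in csv.DictReader(handle):
        # The column is a string such as "[[86, 110], [110, 110]]";
        # ast.literal_eval turns it into nested lists without running code.
        for u, v in ast.literal_eval(row["cluster_moves"]):
            yield (u, v)


# A directed graph keeps the direction of each move; nx.Graph() would also work.
g = nx.DiGraph()
g.add_edges_from(read_moves(io.StringIO(SAMPLE)))
print(sorted(g.edges()))
```

Note that a plain Graph or DiGraph stores each distinct edge only once, so repeated moves such as [97, 97] collapse into a single edge here.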
Edit
Is it possible to add edge weights to this data structure when reading it into NetworkX, and then adjust each weight based on the count/frequency of its edge? I would like to do this so I can visualize edges that have a higher frequency/count with a different color or line width. Using the answer below, I have tried using g.add_weighted_edges_from() and using weight=1 as an attribute instead of using g.add_edges_from(), but this did not work properly. I also tried this, with no luck:
for u,v,d in g.edges():
    d['weight'] = 1
g.edges(data=True)
edges = g.edges()
weights = [g[u][v]['weight'] for u,v in edges]
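One likely reason the loop fails is that g.edges() yields plain (u, v) pairs, so unpacking into u, v, d raises an error unless data=True is passed. A possible way to get frequency-based weights is to count the moves before building the graph. This is only a sketch: the moves list below is typed in by hand from three of the sample rows, and the names moves and counts are illustrative.

```python
from collections import Counter

import networkx as nx

# Moves taken from users 11011, 3945641 and 14270422 in the sample data.
moves = [
    (86, 110), (110, 110),
    (36, 73), (73, 110), (110, 110),
    (0, 110), (110, 174),
]

# Count how often each (source, target) pair occurs.
counts = Counter(moves)

g = nx.DiGraph()
# Each 3-tuple is (source, target, weight); the weight is the move count.
g.add_weighted_edges_from((u, v, n) for (u, v), n in counts.items())

# Collect weights in the same order that g.edges() reports the edges.
weights = [d["weight"] for u, v, d in g.edges(data=True)]
print(weights)
```

The weights list could then be passed to the drawing call, for example nx.draw(g, width=weights), so that more frequent moves are drawn with thicker lines.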