I wrote a basic program to load a CSV edgelist into a network, calculate four centrality metrics for each node, and write the results to a CSV file. I'm using NetworkX,
and everything worked fine when the node IDs were numbers. However, now that I've moved to another example that uses Twitter usernames as node IDs, I get the following error:
Error
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 23-24: invalid continuation byte
Code
import sys
import networkx as nx
import csv
# load CSV edgelist into NetworkX
G = nx.read_edgelist(sys.argv[1], delimiter=',')
# calculate centrality metrics
degree = nx.degree_centrality(G)
between = nx.betweenness_centrality(G)
close = nx.closeness_centrality(G)
eigen = nx.eigenvector_centrality(G)
# write centrality results to a list
centrality = []
for i in G:
    row = i, degree[i], between[i], close[i], eigen[i]
    centrality.append(row)
# write list to CSV
outfile = sys.argv[1].replace('.csv', '_metrics.csv')
header = 'NodeID', 'Degree', 'Betweenness', 'Closeness', 'Eigenvector'
with open(outfile, 'wb') as f:
    csv.writer(f).writerow(header)
    csv.writer(f).writerows(centrality)
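To narrow this down, here is a minimal diagnostic sketch that lists the lines of the input file that fail to decode as UTF-8. It assumes the problem is in the input file's bytes rather than in NetworkX itself, which I have not confirmed. The helper name `find_undecodable_lines` is hypothetical and not part of NetworkX or my program.

```python
def find_undecodable_lines(path, encoding='utf-8'):
    """Return (line number, byte offset, raw bytes) for each line of the
    file at `path` that cannot be decoded with `encoding`.

    Hypothetical helper for diagnosis only; not part of NetworkX.
    """
    bad = []
    # Read raw bytes so that opening the file never triggers a decode error.
    with open(path, 'rb') as f:
        for lineno, raw in enumerate(f, start=1):
            try:
                raw.decode(encoding)
            except UnicodeDecodeError as err:
                # err.start is the offset of the first offending byte in the line.
                bad.append((lineno, err.start, raw))
    return bad
```

Calling `find_undecodable_lines(sys.argv[1])` should show which usernames contain the offending bytes. If the file turns out to be in a different encoding (Latin-1, for example, which is only a guess), `nx.read_edgelist` accepts an `encoding` argument that could be set to match.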