I am trying to plot the adjacency matrix of the Berkeley-Stanford web graph from its edge list using Python. The result should look like the one in this link. My first attempt used a dense representation:
    import matplotlib.pyplot as plt
    import re
    n=685230
    matrix = [[0 for i in range(0,n)] for j in range(0,n)]
    with open("web-BerkStan.txt", "r") as infile:
        for line in infile:
            li=line.strip()
            if not li.startswith("#"):
                i,j = re.split(r'\t+', li)
                matrix[int(i)][int(j)] = 1
    f = plt.figure()
    plt.imshow(matrix, cmap='binary', interpolation='nearest')
    plt.show()
    f.savefig('graph.pdf', bbox_inches='tight')
The code works for, e.g., n=10, but it is infeasible for the size of the web graph I am considering, given n=685230.
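For a rough sense of scale, here is a back-of-the-envelope estimate of what the dense representation would need. It assumes 8-byte references per list slot (typical for 64-bit CPython) and, as a more compact alternative, one byte per entry. The variable names are only illustrative.

```python
n = 685230

# Number of cells in a dense n-by-n adjacency matrix.
entries = n * n

# Assumption: each slot of a nested Python list holds an 8-byte reference.
list_bytes = entries * 8

# Assumption: a packed array with one byte per cell (e.g. an unsigned 8-bit type).
packed_bytes = entries

print(f"{entries:,} entries")
print(f"nested lists: about {list_bytes / 1e12:.2f} TB")
print(f"one byte per entry: about {packed_bytes / 1e9:.0f} GB")
```

Either way the dense matrix is far larger than the memory of an ordinary machine, which is why I looked at a sparse format next.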
Alternatively, I tried to replace the dense representation with the coordinate format:
    matrix = []
    with open("web-BerkStan.txt", "r") as infile:
        for line in infile:
            li=line.strip()
            if not li.startswith("#"):
                i,j = re.split(r'\t+', li)
                matrix.append([int(i),int(j)])
where matrix is now an edge list. How do I plot the edge list as a sparse matrix?
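One direction that might work, as a minimal sketch rather than a tested solution for the full graph: build a SciPy `coo_matrix` from the edge list and draw it with matplotlib's `spy`, which plots one marker per stored entry when given a sparse matrix and a `markersize`. This assumes SciPy is installed. `read_edges` and `plot_sparsity` are hypothetical helper names, the matrix size is inferred from the largest node id instead of being hard-coded, and the output is a PNG because a vector PDF with one marker per edge could become very large.

```python
def read_edges(lines):
    """Parse 'src<TAB>dst' lines into two parallel lists, skipping '#' comments."""
    rows, cols = [], []
    for line in lines:
        li = line.strip()
        if not li or li.startswith("#"):
            continue  # skip blank lines and header comments
        i, j = li.split()  # splits on any whitespace, including tabs
        rows.append(int(i))
        cols.append(int(j))
    return rows, cols


def plot_sparsity(rows, cols, path="graph.png"):
    """Draw one small marker per edge at (row i, column j) and save the figure."""
    # Imported here so that the parser works without the plotting stack installed.
    import matplotlib.pyplot as plt
    from scipy.sparse import coo_matrix

    # Assumption: ids are non-negative integers; +1 covers 0- and 1-based ids.
    n = max(max(rows), max(cols)) + 1
    data = [1] * len(rows)
    adjacency = coo_matrix((data, (rows, cols)), shape=(n, n))

    fig, ax = plt.subplots()
    # With a sparse input and a markersize, spy plots markers only at stored entries.
    ax.spy(adjacency, markersize=0.1)
    fig.savefig(path, dpi=300, bbox_inches="tight")
    plt.close(fig)
```

With the file from the question, usage would look like `rows, cols = read_edges(open("web-BerkStan.txt"))` followed by `plot_sparsity(rows, cols)`. The marker size and dpi are guesses and would probably need tuning for a graph of this size.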