
I am trying to plot the adjacency matrix of the Berkeley-Stanford web graph from its edge list using Python. The result should be as in this link. My first attempt was to use a dense representation as follows:

import matplotlib.pyplot as plt
import re

n=685230
matrix = [[0 for i in range(0,n)] for j in range(0,n)]

with open("web-BerkStan.txt", "r") as infile:
    for line in infile:
        li=line.strip()
        if not li.startswith("#"):
            i,j = re.split(r'\t+', li)
            matrix[int(i)][int(j)] = 1


f = plt.figure()

plt.imshow(matrix, cmap='binary', interpolation='nearest')
plt.show()

f.savefig('graph.pdf', bbox_inches='tight')

The code works for, e.g., n=10, but it is infeasible for the web graph I am considering, where n=685230: the dense matrix would have n² ≈ 4.7 × 10^11 entries.
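For a sense of scale, here is a rough back-of-the-envelope estimate of the dense list-of-lists size. It assumes 8 bytes per list slot (one pointer on a 64-bit CPython build) and ignores per-list overhead, so it is only a lower bound under that assumption.

```python
n = 685230

# Number of cells in the dense n x n matrix.
entries = n * n

# Assumption: each list slot holds one 8-byte pointer (64-bit CPython).
bytes_per_slot = 8
total_bytes = entries * bytes_per_slot

print(f"{entries:.3e} entries, roughly {total_bytes / 1e12:.2f} TB of list slots")
```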

Alternatively, I tried to replace the dense representation with the coordinate format:

matrix = []
with open("web-BerkStan.txt", "r") as infile:
    for line in infile:
        li=line.strip()
        if not li.startswith("#"):
            i,j = re.split(r'\t+', li)
            matrix.append([int(i),int(j)])

where matrix is now an edge list. How do I plot the edge list as a sparse matrix?
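One direction that might work is sketched below. It is untested on the full graph and rests on several assumptions: that SciPy is installed, that node ids are non-negative integers no larger than n, and that a raster output (PNG) is acceptable, since a PDF with millions of markers could be very large. The helper names `read_edges` and `plot_sparse` are mine, not from any library. The sketch builds a `scipy.sparse.coo_matrix` from the edge list and hands it to matplotlib's `spy`, which draws one marker per nonzero entry without ever allocating the dense matrix.

```python
def read_edges(path):
    """Return a list of (i, j) integer pairs from an edge-list file.

    Lines starting with '#' are treated as comments, as in the snippets above.
    Fields are split on any whitespace rather than tabs only.
    """
    edges = []
    with open(path, "r") as infile:
        for line in infile:
            li = line.strip()
            if li and not li.startswith("#"):
                i, j = li.split()[:2]
                edges.append((int(i), int(j)))
    return edges


def plot_sparse(edges, n, out="graph.png"):
    """Plot the sparsity pattern of the adjacency matrix given by `edges`.

    Assumes `edges` is non-empty and all node ids lie in 0..n.
    """
    # Imported here so that read_edges can be used without these packages.
    import matplotlib.pyplot as plt
    from scipy.sparse import coo_matrix

    rows, cols = zip(*edges)
    # Shape is (n + 1, n + 1) so ids from 0 up to n both fit.
    adjacency = coo_matrix(([1] * len(rows), (rows, cols)), shape=(n + 1, n + 1))

    fig, ax = plt.subplots()
    # For sparse input, spy draws one marker per nonzero entry.
    ax.spy(adjacency, markersize=0.01)
    fig.savefig(out, dpi=300, bbox_inches="tight")
```

Usage would be something like `plot_sparse(read_edges("web-BerkStan.txt"), 685230)`. The `markersize` and `dpi` values are guesses and would probably need tuning for the pattern to be visible at this size.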

Matteo
  • Have you tried using `networkx` python package? Related [question](https://stackoverflow.com/questions/29572623/plot-networkx-graph-from-adjacency-matrix-in-csv-file). – syltruong Jul 13 '18 at 03:48
  • Yes, I have been trying to follow this [guide](http://sociograph.blogspot.com/2012/11/visualizing-adjacency-matrices-in-python.html) but it is not clear to me how to define a graph in `networkx` starting from an edge list. Also, in the guide it seems that they explicitly define an adjacency matrix which can be too large to store. – Matteo Jul 13 '18 at 04:20
