Here's an example of using a generator to populate a sparse matrix. I use the generator to fill a structured array, and create the sparse matrix from its fields.
import numpy as np
from scipy import sparse

N, M = 3, 4

def foo(N, M):
    # just a simple dense matrix of random data
    cnt = 0
    for i in range(N):
        for j in range(M):
            # yield a running index plus one (row, col, value) record
            yield cnt, (i, j, np.random.random())
            cnt += 1

# structured array with one record per generated element
dt = np.dtype([('i', int), ('j', int), ('data', float)])
X = np.empty((N*M,), dtype=dt)
for cnt, tup in foo(N, M):
    X[cnt] = tup

print(X.shape)
print(X['i'])
print(X['j'])
print(X['data'])

# build the sparse matrix from the three fields
S = sparse.coo_matrix((X['data'], (X['i'], X['j'])), shape=(N, M))
print(S.shape)
print(S.toarray())
producing something like:
(12,)
[0 0 0 0 1 1 1 1 2 2 2 2]
[0 1 2 3 0 1 2 3 0 1 2 3]
[ 0.99268494  0.89277993  0.32847213  0.56583702  0.63482291  0.52278063
  0.62564791  0.15356269  0.1554067   0.16644956  0.41444479  0.75105334]
(3, 4)
[[ 0.99268494  0.89277993  0.32847213  0.56583702]
 [ 0.63482291  0.52278063  0.62564791  0.15356269]
 [ 0.1554067   0.16644956  0.41444479  0.75105334]]
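As a variation on the fill loop, the structured array can probably be built in one call with np.fromiter, which accepts an iterable of tuples when given a structured dtype. This is only a sketch: the generator name triples is hypothetical, and it yields bare (row, col, value) tuples without the running counter.

```python
import numpy as np
from scipy import sparse

N, M = 3, 4

def triples(n, m):
    # hypothetical generator: one (row, col, value) tuple per cell
    for i in range(n):
        for j in range(m):
            yield i, j, np.random.random()

dt = np.dtype([('i', int), ('j', int), ('data', float)])

# count is optional, but lets numpy allocate the array once up front
X = np.fromiter(triples(N, M), dtype=dt, count=N * M)

S = sparse.coo_matrix((X['data'], (X['i'], X['j'])), shape=(N, M))
print(S.toarray())
```

If the number of records is not known in advance, count can be left out and fromiter will grow the array as it consumes the generator.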
All of the nonzero data points will exist in memory in 2 forms - the fields of X, and the row, col, data arrays of the sparse matrix.
A structured array like X could also be loaded from the columns of a csv file.
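A minimal sketch of the csv route, assuming a file with one row,col,value triple per line and no header. The csv text, the field names and the (3, 4) shape are made up for illustration; a real file path could be passed to np.genfromtxt in place of the StringIO object.

```python
import io
import numpy as np
from scipy import sparse

# hypothetical csv contents: row index, column index, value
csv_text = u"""0,1,0.5
1,3,2.0
2,0,1.5
"""

dt = np.dtype([('i', int), ('j', int), ('data', float)])

# each csv column lands in one field of the structured array
X = np.genfromtxt(io.StringIO(csv_text), delimiter=',', dtype=dt)

S = sparse.coo_matrix((X['data'], (X['i'], X['j'])), shape=(3, 4))
print(S.toarray())
```

Only the listed triples are stored; every other element of S is an implicit zero.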
A couple of the sparse matrix formats let you set data elements, e.g.
S = sparse.lil_matrix((N, M))
for cnt, tup in foo(N, M):
    i, j, value = tup
    # element-wise assignment into the lil matrix
    S[i, j] = value
print(S.toarray())
sparse tells me that lil is the least expensive format for this type of assignment.
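One way to see this for yourself, assuming the hint in question is the SparseEfficiencyWarning that scipy.sparse issues when a new nonzero element is assigned into a csr matrix (that reading is my assumption). The helper name warns_on_assignment is hypothetical. The sketch also shows the usual follow-up of building in lil and converting to csr afterwards.

```python
import warnings
from scipy import sparse

def warns_on_assignment(matrix):
    # hypothetical helper: True if setting a new nonzero element
    # triggers a SparseEfficiencyWarning for this matrix format
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        matrix[0, 1] = 1.0
    return any(issubclass(w.category, sparse.SparseEfficiencyWarning)
               for w in caught)

csr_warned = warns_on_assignment(sparse.csr_matrix((3, 4)))
lil_warned = warns_on_assignment(sparse.lil_matrix((3, 4)))
print(csr_warned, lil_warned)

# build incrementally in lil, then convert once for arithmetic
S = sparse.lil_matrix((3, 4))
S[0, 1] = 1.0
S[2, 3] = 2.0
C = S.tocsr()
print(C.toarray())
```

The conversion at the end is a single pass, so the cost of the expensive format is paid once rather than on every assignment.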