In https://stackoverflow.com/a/27771335/901925 I explore incremental matrix assignment.
`lol` and `dok` are the recommended formats if you want to change values. `csr` will give you an efficiency warning, and `coo` does not allow indexing.
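A minimal sketch of the difference, assuming a recent SciPy; the shapes and values are made up for illustration. Assigning into `lil` or `dok` is quiet, while assigning a new nonzero into `csr` works but raises `SparseEfficiencyWarning`:

```python
import warnings
from scipy import sparse

# lil and dok accept element assignment without complaint
lil = sparse.lil_matrix((4, 4))
lil[1, 2] = 4.0
dok = sparse.dok_matrix((4, 4))
dok[2, 1] = 6.0

# csr also accepts it, but changing its sparsity structure triggers a warning
csr = sparse.csr_matrix((4, 4))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    csr[1, 2] = 4.0

warned = any(issubclass(w.category, sparse.SparseEfficiencyWarning) for w in caught)
print(warned)
```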
But I also found that `dok` indexing is slow compared to regular dictionary indexing. So for many changes it is better to build a plain dictionary (with the same tuple indexing), and build the `dok` matrix from that.
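One way to sketch the dictionary route is below. Note the substitution: instead of updating the `dok` matrix directly from the dictionary (which behaves differently across SciPy versions), this goes through `coo` and converts with `todok()`. The index pairs, values and shape are invented for the example.

```python
from scipy import sparse

shape = (4, 4)

# Accumulate values in a plain dict keyed by (row, col) tuples
d = {}
for k, (i, j) in enumerate([(0, 0), (1, 2), (2, 1)]):
    d[(i, j)] = float(i + j + k + 1)   # stand-in for the real calculation

# Unpack keys and values, build coo, then convert to dok
rows, cols = zip(*d.keys())
vals = list(d.values())
M = sparse.coo_matrix((vals, (rows, cols)), shape=shape).todok()
```

All the slow per-element work happens on the ordinary dictionary; the sparse matrix is built once at the end.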
But if you can calculate the `H` data values with a fast `numpy` vector operation, as opposed to iteration, it is best to do so, and construct the sparse matrix from that (e.g. in `coo` format). In fact, even with iteration this would be faster:

    h = np.zeros(A.shape)
    for k, (i, j) in enumerate(zip(A, B)):
        h[k] = compute_something   # placeholder for the per-element calculation
    H = sparse.coo_matrix((h, (A, B)), shape=(n, m))

e.g.

    In [780]: A=np.array([0,1,1,2]); B=np.array([0,2,2,1])
    In [781]: h=np.zeros(A.shape)
    In [782]: for k, (i,j) in enumerate(zip(A,B)):
       .....:     h[k] = i+j+k
       .....:
    In [783]: h
    Out[783]: array([ 0.,  4.,  5.,  6.])
    In [784]: M=sparse.coo_matrix((h,(A,B)),shape=(4,4))
    In [785]: M
    Out[785]:
    <4x4 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in COOrdinate format>
    In [786]: M.A
    Out[786]:
    array([[ 0.,  0.,  0.,  0.],
           [ 0.,  0.,  9.,  0.],
           [ 0.,  6.,  0.,  0.],
           [ 0.,  0.,  0.,  0.]])

Note that the (1,2) value is the sum 4+5: the index pair (1,2) appears twice in `A`, `B`, and duplicate entries are summed. That's part of the `coo` to `csr` conversion.
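A small sketch of that summing, using the same index arrays; the first data value is changed to 1.0 here (an illustrative choice) so that no explicit zero is stored. The `coo` matrix keeps both (1,2) entries, and they are merged on conversion:

```python
import numpy as np
from scipy import sparse

A = np.array([0, 1, 1, 2])
B = np.array([0, 2, 2, 1])
h = np.array([1.0, 4.0, 5.0, 6.0])

M = sparse.coo_matrix((h, (A, B)), shape=(4, 4))
print(M.nnz)        # coo stores every entry, duplicates included

Mr = M.tocsr()      # the conversion sums entries that share (row, col)
print(Mr.nnz)
print(Mr[1, 2])
```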
In this case I could have calculated `h` with:

    In [791]: A+B+np.arange(A.shape[0])
    Out[791]: array([0, 4, 5, 6])

so there's no need for iteration.
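Put together as a script, the vectorized route might look like the sketch below. It also checks the result against the loop version; `toarray()` is used here in place of the `.A` shorthand from the session above.

```python
import numpy as np
from scipy import sparse

A = np.array([0, 1, 1, 2])
B = np.array([0, 2, 2, 1])

# Loop version, as in the session above
h_loop = np.zeros(A.shape)
for k, (i, j) in enumerate(zip(A, B)):
    h_loop[k] = i + j + k

# Vectorized version: one array expression, no Python-level iteration
h_vec = A + B + np.arange(A.shape[0])

same = np.array_equal(h_loop, h_vec)
H = sparse.coo_matrix((h_vec, (A, B)), shape=(4, 4))
dense = H.toarray()
print(same)
print(dense)
```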