How to get the blocks back from a scipy sparse block matrix?

Question

After some vectorized calculations, I get a sparse block matrix with all my results stacked in blocks of same size.

>>> A = [[1, 1],
...      [1, 1]]
>>> B = [[2, 2],
...      [2, 2]]
>>> C = [[3, 3],
...      [3, 3]]
>>> results = scipy.sparse.block_diag(A, B, C)
>>> print(results.toarray())
[[1 1 0 0 0 0]
 [1 1 0 0 0 0]
 [0 0 2 2 0 0]
 [0 0 2 2 0 0]
 [0 0 0 0 3 3]
 [0 0 0 0 3 3]]

How can I get back these arrays A,B,C in an efficient way, if necessery by providing their shape (2,2)?

[Related](https://stackoverflow.com/questions/31527755/extract-blocks-or-patches-from-numpy-array). — Cleb, Dec 01 '18 at 19:31
Thanks! That's a way to go but it doesn't take advantage of the sparsity. — TBaz, Dec 01 '18 at 20:39

hpaulj · Accepted Answer · 2018-12-01T21:03:44.173

In [177]: >>> A = [[1, 1],
     ...: ...      [1, 1]]
     ...: >>> B = [[2, 2],
     ...: ...      [2, 2]]
     ...: >>> C = [[3, 3],
     ...: ...      [3, 3]]
     ...: >>> results = sparse.block_diag([A, B, C])
     ...:      
In [178]: results
Out[178]: 
<6x6 sparse matrix of type '<class 'numpy.int64'>'
    with 12 stored elements in COOrdinate format>

block_diag does not preserve the inputs; rather it creates coo format matrix, representing the whole matrix, not the pieces.

In [194]: results.data
Out[194]: array([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3], dtype=int64)
In [195]: results.row
Out[195]: array([0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5], dtype=int32)
In [196]: results.col
Out[196]: array([0, 1, 0, 1, 2, 3, 2, 3, 4, 5, 4, 5], dtype=int32)


In [179]: results.A
Out[179]: 
array([[1, 1, 0, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [0, 0, 2, 2, 0, 0],
       [0, 0, 2, 2, 0, 0],
       [0, 0, 0, 0, 3, 3],
       [0, 0, 0, 0, 3, 3]], dtype=int64)

block_diag pass the arrays to sparse.bmat. That in turn makes a coo matrix from each, and then merges the coo attributes into 3 arrays, which are inputs to the global sparse matrix.

There is another sparse format bsr that may preserve the blocks (until conversion to csr for calculation), but I'll have to experiment to see that's the case.

Let's make a bsr from that results coo:

In [186]: bresults = sparse.bsr_matrix(results)
In [187]: bresults
Out[187]: 
<6x6 sparse matrix of type '<class 'numpy.int64'>'
    with 12 stored elements (blocksize = 2x2) in Block Sparse Row format>
In [188]: bresults.blocksize
Out[188]: (2, 2)
In [189]: bresults.data
Out[189]: 
array([[[1, 1],
        [1, 1]],

       [[2, 2],
        [2, 2]],

       [[3, 3],
        [3, 3]]], dtype=int64)

So it deduces that there are blocks, just as you desired.

In [191]: bresults.indices
Out[191]: array([0, 1, 2], dtype=int32)
In [192]: bresults.indptr
Out[192]: array([0, 1, 2, 3], dtype=int32)

So it's a csr like storage, but with the data arranged in blocks.

It may be possible to construct this from your A,B,C without the block_diag intermediary, but I'd have to look at the docs more.

Great answer ! In my case, it enables me to do all my calculations in csr format, then convert it back to bsr format that deduces well the blocks from there and I can get it back in the bresults.data . Thanks a lot, I was not aware that the conversion to bsr format would implicitely recover the structure. — TBaz, Dec 01 '18 at 21:23

druskacik · Answer 2 · 2018-12-01T19:37:30.353

that's a funny little problem.

I don't think there is a function that solves this in one line, but there's a way to do it programmatically.

Check out what res.data prints out, I use it here.

This works when shapes are all the same.

from scipy.sparse import block_diag

a = [[1, 2, 4],
    [3, 4, 4]]
b = [[2, 2, 1],
    [2, 2, 1]]
c = [[3, 3, 6],
    [3, 3, 6]]

res = block_diag((a, b, c))

def goBack(res, shape):
    s = shape[0]*shape[1]
    num = int(len(res.data)/s)
    for i in range (num):
        mat = res.data[i*s:(i+1)*s].reshape(shape)
        print(mat)

goBack(res, [2,3])

Output:

[[1 2 4]
 [3 4 4]]
[[2 2 1]
 [2 2 1]]
[[3 3 6]
 [3 3 6]]

Edit:

Okay, this does not work when any of the elements of the provided matrices is zero, as then it would not be counted in res.data.

Also, forget it, the link provided by cleb should help you.

This assumes a specific sparse matrix representation without specifying the assumption, and it assumes that the blocks are 100% dense, with no zeros. — user2357112, Dec 01 '18 at 19:34

How to get the blocks back from a scipy sparse block matrix?

2 Answers2

Linked