I have two SciPy sparse matrices, 'a' and 'b', with boolean values in them. 'a' is far bigger than 'b': 765565 stored values against just 3.

In [211]: a
Out[211]: <388839x8455 sparse matrix of type '<class 'numpy.bool_'>'
           with 765565 stored elements in Compressed Sparse Row format>
In [212]: b
Out[212]: <5x3 sparse matrix of type '<class 'numpy.bool_'>'
           with 3 stored elements in Compressed Sparse Row format>

But when I check their sizes in terms of memory usage, I see that they are both just 56 bytes:

In [213]: from sys import getsizeof
          'Size of a: {}. Size of b: {}'.format(getsizeof(a), getsizeof(b))
Out[213]: 'Size of a: 56. Size of b: 56'

How come these matrices' sizes are the same, while matrix 'a' has to store over 200 thousand times more values than matrix 'b'?

Sergey Zakharov
  • [See here](https://stackoverflow.com/questions/449560/how-do-i-determine-the-size-of-an-object-in-python). `getsizeof`: *Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.* – Cory Kramer Apr 28 '17 at 13:24
  • @CoryKramer Thanks for the hint. `a.data.nbytes` works in this case. – Sergey Zakharov Apr 28 '17 at 13:45
  • @SergeyZakharov, don't forget to sum it up with `a.indptr.nbytes + a.indices.nbytes` – MaxU - stand with Ukraine Apr 28 '17 at 13:47

1 Answer

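`sys.getsizeof` accounts only for the matrix object itself, not for the NumPy arrays it refers to. A CSR matrix keeps its contents in the `data`, `indices` and `indptr` arrays, so sum their `nbytes` instead. Here is a small demo: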

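import numpy as np
from scipy import sparse

# 10000 x 1000 random CSR matrix with density 0.001
M = sparse.random(10**4, 10**3, .001, 'csr')

def sparse_memory_usage(mat):
    try:
        # data holds the stored values; indptr and indices hold the sparsity structure
        return mat.data.nbytes + mat.indptr.nbytes + mat.indices.nbytes
    except AttributeError:
        # mat has no CSR/CSC-style arrays (e.g. a dense ndarray)
        return -1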

In [140]: sparse_memory_usage(np.random.rand(100, 100))
Out[140]: -1

In [141]: M = sparse.random(10**4, 10**3, .001, 'csr')

In [142]: sparse_memory_usage(M)
Out[142]: 160004

In [144]: M
Out[144]:
<10000x1000 sparse matrix of type '<class 'numpy.float64'>'
        with 10000 stored elements in Compressed Sparse Row format>
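
To tie this back to the question, here is a minimal sketch that puts `sys.getsizeof` next to the summed array sizes for two boolean CSR matrices. The matrices are illustrative stand-ins, not the asker's data, and `csr_nbytes` is just a hypothetical helper name. The exact `getsizeof` figure depends on the Python build, but it should come out the same for both matrices, while the summed buffers grow with the number of stored elements.

```python
import sys

import numpy as np
from scipy import sparse


def csr_nbytes(mat):
    """Sum the array buffers behind a CSR/CSC matrix (hypothetical helper)."""
    return mat.data.nbytes + mat.indices.nbytes + mat.indptr.nbytes


# Illustrative boolean CSR matrices of very different sizes.
small = sparse.csr_matrix(np.eye(5, 3, dtype=bool))  # 5x3, 3 stored elements
large = sparse.random(10**4, 10**3, density=0.001, format='csr').astype(bool)

for name, mat in [('small', small), ('large', large)]:
    # getsizeof sees only the wrapper object; csr_nbytes sees the arrays it points to.
    print(name, sys.getsizeof(mat), csr_nbytes(mat))

# Per-array breakdown for the large matrix:
#   data    -> one value per stored element (1 byte each for bool)
#   indices -> one column index per stored element
#   indptr  -> one offset per row, plus one
print(large.data.nbytes, large.indices.nbytes, large.indptr.nbytes)
```

For a boolean matrix most of the memory goes to `indices` and `indptr` rather than `data`, which is why counting `a.data.nbytes` alone underestimates the total.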
MaxU - stand with Ukraine