I'm working with relatively large high-dimensional sparse arrays using scipy.sparse. The actual data and row/column indices are no issue to store.

The problem is that I end up with things like

sp.csr_matrix(([1], ([0], [0])), shape=(int(1e14), 1)).shape

which gives

MemoryError: Unable to allocate 728. TiB for an array with shape (100000000000001,) and data type int64

since it looks like scipy tries to allocate a dense array with one entry per row (row/column masks, or something like that?).

Is there a good workaround for this? Would using coo_matrix fix it?

Update

It turns out I'm just an idiot and should have paid better attention to whether I was using a CSC matrix or CSR matrix.

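CSR compresses the row indices into a pointer array (`indptr`) with one entry per row plus one, so its size grows with the number of rows. CSC does the same for columns, so its `indptr` has one entry per column plus one. For data stored like this, with a huge number of rows and a single column (in what I am sure is a terrible format for making use of sparsity), CSC will work way better.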

In any case, both of these work fine:

wat = sp.csc_matrix(([1], ([0], [0])), shape=(int(1e14), 1))
wat2 = sp.csr_matrix(([1], ([0], [0])), shape=(1, int(1e14)))

so this was just a misunderstanding on my part of what CSC and CSR do for us.
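
A minimal sketch of the difference, assuming a small stand-in size `n = 1000` (my choice for illustration, not the real data) so that the CSR case actually fits in memory. It compares the length of `indptr` for the same tall, single-column matrix in both formats. The last lines also try `coo_matrix` at the full size, on the understanding that COO keeps only the data, row, and column arrays and so should not need a per-row array at all:

```python
import scipy.sparse as sp

n = 1_000  # hypothetical stand-in for int(1e14)

# Same single nonzero entry, same tall shape, two storage formats.
tall_csr = sp.csr_matrix(([1], ([0], [0])), shape=(n, 1))
tall_csc = sp.csc_matrix(([1], ([0], [0])), shape=(n, 1))

# CSR keeps one indptr entry per row, plus one.
print("CSR indptr length:", tall_csr.indptr.shape[0])

# CSC keeps one indptr entry per column, plus one.
print("CSC indptr length:", tall_csc.indptr.shape[0])

# COO stores only data/row/col, so the huge shape is just metadata here.
huge_coo = sp.coo_matrix(([1], ([0], [0])), shape=(int(1e14), 1))
print("COO shape:", huge_coo.shape, "stored entries:", huge_coo.nnz)
```

So for a shape like `(int(1e14), 1)`, the CSR pointer array scales with the 1e14 rows while the CSC one has just two entries.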

b3m2a1
    The `indptr` attribute of a `csr` format has one value per row (plus 1). That's the array it can't create. – hpaulj Feb 12 '21 at 04:58
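
To put a number on that comment, here is a rough back-of-the-envelope check, assuming 8-byte `int64` index entries as reported in the error message. It estimates the size of an `indptr` array with one entry per row plus one for `int(1e14)` rows:

```python
import numpy as np

n_rows = int(1e14)

# Bytes per index entry, taken from the dtype named in the MemoryError.
itemsize = np.dtype(np.int64).itemsize

# CSR indptr has n_rows + 1 entries.
indptr_bytes = (n_rows + 1) * itemsize

# Convert to TiB (2**40 bytes).
tib = indptr_bytes / 2**40
print(f"{tib:.1f} TiB")
```

This comes out at roughly 728 TiB, which matches the shape `(100000000000001,)` and the size in the error message.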

1 Answer

If you are using a 32-bit build of Python, please upgrade to a 64-bit build (assuming you have 64-bit hardware and a 64-bit OS). Beyond that, this is a memory error: your system's available memory is not sufficient for an allocation this large, an array of shape (100000000000001,).

How much RAM is available on your system? Adding more details about your system, such as the OS and the amount of RAM, would be helpful.

See the following related links, which cover a similar (though not identical) issue: Link 1, Link 2

Ahmad
    ...I know where the error is coming from and I'm using a 64 bit build and that still won't help (do you know of a system with 728. TiB of RAM?). The issue is that `scipy` internally is allocating a massive row index array when I don't think it needs to. The question is really how to make it not do that. – b3m2a1 Feb 12 '21 at 03:34