
I have a matrix (a scipy.sparse.csr.csr_matrix type) which looks like this:

(0, 31) 0.000528868711772147
(0, 32) 4.84173520932837e-05
(0, 33) 4.10541590795596e-05
(0, 34) 0.000408771225384504
(0, 35) 0.000795847618707398
:  :
(16086, 118806) 0.00047416210140481
(16086, 118809) 0.00856067420817794
(16086, 118826) 0.00420368450693882
(16086, 131832) 0.00111739160477843
(16086, 131905) 0.00389774479846667

I'm trying to convert it to a numpy array. I've tried both .toarray() and .todense(), but neither of them works; I'm getting the following error:

Unable to allocate 18.0 GiB for an array with shape (16087, 150360) and data type float64

Do you have any idea how to do that? Thanks in advance.

brenda
  • What do you mean by "none of them seems to be working"? That's too vague a description. Is there some sort of error? Is it working, but with unexpected results? – hpaulj Feb 11 '21 at 00:10
  • It is a memory error. – brenda Feb 11 '21 at 00:11
  • That's what I suspected, but I wanted you to tell the whole world. Do you understand why? – hpaulj Feb 11 '21 at 00:13
  • Not really, I would really appreciate your help. – brenda Feb 11 '21 at 00:15
  • Doesn't the error explain it? The dense array is way too large for your memory. Look at the numbers: 18 GiB. That's `16087*150360*8/1e9`! – hpaulj Feb 11 '21 at 00:32
  • Take a look at this thread; it may help: https://stackoverflow.com/questions/66165897/scipy-large-sparse-array-dimensions-memoryerror – qwerty Feb 25 '22 at 22:58

1 Answer


Your RAM is not enough for this kind of operation. Divide the data into chunks and process it chunk by chunk.
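
A minimal sketch of that chunked approach, assuming the sparse matrix is called `sparse_mat` and that densifying 1000 rows at a time fits in memory (the variable name, the chunk size, and the placeholder matrix are illustrative, not taken from the question):

```python
import numpy as np
from scipy import sparse

# Placeholder standing in for the (16087, 150360) csr_matrix from the question.
sparse_mat = sparse.random(16087, 150360, density=0.0001, format="csr")

chunk_size = 1000  # rows to densify at once; tune this to your available RAM

for start in range(0, sparse_mat.shape[0], chunk_size):
    end = min(start + chunk_size, sparse_mat.shape[0])
    # Only this (chunk_size, 150360) slice is dense in memory at any time.
    dense_chunk = sparse_mat[start:end].toarray()
    # ... process dense_chunk here; it is freed before the next iteration ...
    print(start, end, dense_chunk.shape)
```

If whatever you do next (e.g. scikit-learn estimators, matrix products, row-wise statistics) accepts a sparse matrix directly, it is usually better to skip the conversion entirely and keep working with the csr_matrix.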