I need to create a NumPy array with the following code:

encoder_data = np.zeros((347101, 400, 347101), dtype='float32')

But when I execute that code it raises a MemoryError, so I am thinking that storing the array in a file instead would also be fine.
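For context, the size of the requested array can be estimated with a few lines of arithmetic. This is only a rough sketch: it assumes 4 bytes per float32 element and ignores any per-array overhead.

```python
import math

shape = (347101, 400, 347101)
bytes_per_element = 4  # float32 uses 4 bytes per element

# Python integers have arbitrary precision, so this product cannot overflow.
n_elements = math.prod(shape)
n_bytes = n_elements * bytes_per_element

print(f"{n_elements:,} elements")
print(f"{n_bytes / 1024**4:.1f} TiB")
```

The result is far more than the RAM of an ordinary machine, which is consistent with the MemoryError.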

So I came across an answer that uses memmap to store the array in a file, and I came up with the following code:

encoder_input_data = np.memmap('encoder_input_data.memmap', dtype='float32', mode='w+', shape=(347101, 400, 347101))

But the above code raises another error: "OSError: [Errno 22] Invalid argument"
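As a sanity check, the same np.memmap call can be tried with a much smaller shape. The sketch below uses a hypothetical small shape and a temporary directory, neither of which is from the original code. If it runs without error, the call itself is fine, and the problem is more likely the requested file size, although the exact cause of Errno 22 is not confirmed here.

```python
import os
import tempfile

import numpy as np

# Hypothetical small shape for illustration only;
# the original shape was (347101, 400, 347101).
small_shape = (100, 400, 50)

# Temporary directory so the sketch does not clutter the working directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, 'encoder_input_data.memmap')

# mode='w+' creates (or overwrites) the file; shape is required in this mode.
encoder_input_data = np.memmap(path, dtype='float32', mode='w+', shape=small_shape)

encoder_input_data[0, 0, 0] = 1.0  # writes go through to the backing file
encoder_input_data.flush()         # push pending changes to disk

file_size = os.path.getsize(path)
print(file_size)  # number of elements times 4 bytes
```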

Can someone help me save the equivalent of np.zeros((347101, 400, 347101), dtype='float32') to a file? I'm new to Python, so I don't know exactly what to do next.

jameshwart lopez
  • Do you really want an array this large? This is going to create a massive file that you may not be able to store. Your error may be related to https://stackoverflow.com/a/17663651/6942527. – busybear Mar 22 '18 at 03:08
  • Yes, I really need it. – jameshwart lopez Mar 22 '18 at 03:13
  • I don't know what your use case is, but since it's all zeros, can you use sparse arrays from SciPy? – MarkAWard Mar 22 '18 at 03:17
  • Well, you could look into [`np.save`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.save.html), which has the option of using pickling. I think that could help with compressing your data a bit. But sparse arrays are probably the way to go here. – Niayesh Isky Mar 22 '18 at 04:40
  • This sounds like an [XY problem](https://meta.stackexchange.com/a/66378): You ask how to create a huge file (175 Terabyte) but maybe the question should be how to avoid such an unwieldy data structure in the task you are solving. (If you are on 32 bit Python this blows the available address space; Don't know how 64 bit Python would take it...) – MB-F Mar 22 '18 at 08:03
  • @kazemakase you are right :) Basically I was trying to map audio to an image, where one audio is 400D and an image feature is 40D. I was using https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py, so I tried to treat the 400D as the English characters and the 40D as the French characters, but it looks like I can't go this way. :( I was at the part that creates the matrix for encoder_input_data when I ran into the error :D – jameshwart lopez Mar 22 '18 at 08:28
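
The sparse-array suggestion from the comments could look roughly like the sketch below. It rests on some assumptions: scipy.sparse matrices are two-dimensional, so the 3-D array is represented here as a list of 2-D matrices (one per sample), and the sample count is reduced to a hypothetical 5 for illustration. Whether this layout suits the downstream model is not something the thread settles.

```python
import numpy as np
from scipy import sparse

# n_samples is reduced for illustration; the original first dimension was 347101.
n_samples, n_steps, n_features = 5, 400, 347101

# One empty 2-D sparse matrix per sample. Only nonzero entries are stored,
# so an all-zero matrix costs very little memory.
encoder_input_data = [
    sparse.lil_matrix((n_steps, n_features), dtype=np.float32)
    for _ in range(n_samples)
]

# Set a single entry, analogous to encoder_input_data[0, 3, 12345] = 1.0
# on a dense 3-D array.
encoder_input_data[0][3, 12345] = 1.0

total_nonzero = sum(m.nnz for m in encoder_input_data)
print(total_nonzero)
```

LIL format is used here because it supports cheap incremental assignment; converting each matrix with .tocsr() afterwards is the usual step before doing arithmetic on it.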

0 Answers