I want to create an empty NumPy array in Python and fill it with values later. The code below creates a 1024x1024x1024 array of 2-byte integers, which means it should take at least 2 GiB of RAM.
>>> import numpy as np; from sys import getsizeof
>>> A = np.zeros((1024,1024,1024), dtype=np.int16)
>>> getsizeof(A)
2147483776
From getsizeof(A), we see that the array takes 2^31 + 128 bytes (the extra 128 bytes presumably being header information). However, my task manager shows Python using only 18.7 MiB of memory.
Assuming the array was being compressed, I assigned a random value to every element so that it could no longer be compressed.
>>> for i in range(1024):
...     for j in range(1024):
...         for k in range(1024):
...             A[i,j,k] = np.random.randint(32767, dtype=np.int16)
The loop is still running, and my RAM usage is slowly increasing (presumably as the arrays composing A inflate with the incompressible noise). I'm assuming my code would be faster if I forced NumPy to expand this array from the beginning. Curiously, I haven't seen this behavior documented anywhere!
So: 1. Why does NumPy do this? 2. How can I force NumPy to allocate the memory up front?
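For reference, the element-by-element loop above can probably be replaced by a single vectorized assignment, which should write every element far faster than 2^30 separate Python-level assignments. This is only a minimal sketch: the smaller shape and the use of np.random.default_rng with a fixed seed are my own assumptions to keep the example quick and repeatable, not part of the original test.

```python
import numpy as np

# Smaller shape than the 1024x1024x1024 array in the question
# (an assumption, so the example runs quickly).
shape = (64, 64, 64)
A = np.zeros(shape, dtype=np.int16)

# One vectorized assignment writes every element in place,
# replacing the triple nested loop. Values fall in [0, 32767).
rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
A[...] = rng.integers(0, 32767, size=shape, dtype=np.int16)

# Size of the data buffer alone: 64 * 64 * 64 elements * 2 bytes each.
print(A.nbytes)
```

With the full 1024x1024x1024 shape, the temporary array of random values would itself need another 2 GiB, so filling one slab A[i] at a time may be the more practical variant.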