I recently saw that when creating a numpy array via np.empty or np.zeros, the memory of that numpy array is not actually allocated by the operating system, as discussed in this answer (and this question), because numpy requests the array's memory with malloc (np.empty) or calloc (np.zeros).
In fact, the OS isn't even "really" allocating that memory until you try to access it.
Therefore,
l = np.zeros(2**28)
does not increase the utilized memory the system reports, e.g., in htop.
Only once I touch the memory, for instance by executing
np.add(l, 0, out=l)
the utilized memory is increased.
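The observation above can be reproduced without htop. The following is only a rough sketch, assuming a Unix-like system where the standard resource module is available; peak_rss and measure are hypothetical helper names, and the array is smaller than 2**28 so the script stays cheap to run. Note that ru_maxrss reports the peak resident set size (in KiB on Linux, in bytes on macOS), so it can only grow.

```python
import resource

import numpy as np


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def measure(n):
    """Return peak RSS before allocating, after np.zeros, and after touching."""
    before = peak_rss()
    a = np.zeros(n)          # requests n * 8 bytes; pages may not be resident yet
    allocated = peak_rss()
    np.add(a, 0, out=a)      # writes every element, so every page gets touched
    touched = peak_rss()
    return before, allocated, touched


if __name__ == "__main__":
    before, allocated, touched = measure(2**24)  # 128 MiB of float64
    print("growth after np.zeros:", allocated - before)
    print("growth after touching:", touched - allocated)
```

On a system that allocates lazily, I would expect the first number to be small and the second to be roughly the size of the array, but the exact values depend on the allocator and the OS.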
Because of that behaviour, I have a couple of questions:
1. Is touched memory copied under the hood?
If I touch chunks of the memory only after a while, is the content of the numpy array copied under the hood by the operating system to guarantee that the memory is contiguous?
f = np.zeros(2**28)  # f is an array created like l above
i = 100
f[:i] = 3
while True:
    ...  # Do stuff
    f[i] = ...  # Once the memory "behind" the already allocated chunk of memory is filled
                # with other stuff, does the operating system reallocate the memory and
                # copy the already filled part of the array to the new location?
    i = i + 1
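One way to probe this from Python is to compare the address of the array's buffer before and after touching more of it. This is a minimal sketch, not a definitive test: it only observes the virtual address that numpy exposes through ndarray.ctypes.data and says nothing about which physical pages back it. The variable names address_before and address_after are made up for illustration.

```python
import numpy as np

f = np.zeros(2**20)

address_before = f.ctypes.data   # virtual address of the first element
f[:100] = 3                      # touch only the beginning of the buffer
f[100:] = 1                      # later, touch everything else
address_after = f.ctypes.data

# The buffer's virtual address is fixed when the array is created, so the
# already written values stay where they are from the program's point of view.
print(address_before == address_after)
print(f[0], f[99], f[100], f[-1])
```

As far as I understand, contiguity is a property of this virtual address range; the physical pages mapped behind it on first access do not have to be adjacent.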
2. Touching the last element
As the memory of the numpy array is contiguous, I thought
f[-1] = 3
might require the entire block of memory to be allocated (without touching the entire memory). However, it does not: the utilized memory in htop does not increase by the size of the array. Why is that not the case?
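A possible way to reason about this is in terms of pages: if memory is handed out page by page on first access, a single write should only need the page that contains the written element. The sketch below just does that arithmetic, assuming mmap.PAGESIZE reflects the page size the OS uses for this buffer (huge pages would change the numbers); total_pages and pages_touched are illustrative names.

```python
import mmap

import numpy as np

f = np.zeros(2**20)              # 8 MiB of float64
page_size = mmap.PAGESIZE        # typically 4096 bytes, but system dependent

# Number of pages needed to hold the whole buffer (rounded up).
total_pages = -(-f.nbytes // page_size)

# Address range covered by the last element, f[-1].
addr = f.ctypes.data + (f.size - 1) * f.itemsize
first_page = addr // page_size
last_page = (addr + f.itemsize - 1) // page_size
pages_touched = last_page - first_page + 1

f[-1] = 3                        # writes 8 bytes inside those pages only
print("pages in array:", total_pages)
print("pages covered by f[-1]:", pages_touched)
```

Under that assumption, writing f[-1] would make only a tiny fraction of the buffer resident, which matches htop not showing an increase by the size of the array.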