I am very confused by what is reported by `numpy.ndarray.nbytes`.
I just created an identity matrix with 10^6 rows and columns, which therefore has 10^12 elements. NumPy reports that this array is 7.28 TB, yet the Python process only uses 3.98 GB of memory, as reported by the OS X Activity Monitor.
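For reference, that figure is simply the element count times the 8 bytes of a float64:
10 ** 12 * 8 / 1024 ** 4
# 7.275957614183426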
- Is the whole array contained in memory?
- Does NumPy somehow compress its representation, or is that handled by the OS? (See the small check I sketched right after this list.)
- If I simply calculate `y = 2 * x`, which should be the same size as `x`, the process memory climbs to about 30 GB until the process is killed by the OS. Why does that happen, and what kinds of operations can I perform on `x` without the memory usage expanding so much? (One alternative I am considering is sketched after the code below.)
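To probe the first two points at a smaller scale, I tried a check along these lines. This is only a rough sketch: it assumes the third-party psutil package is installed, that the OS hands out the ~18.6 GiB allocation lazily, and the 50000 x 50000 size is just an illustration, not my real array.
import os
import numpy as np
import psutil

proc = psutil.Process(os.getpid())

z = np.zeros((50000, 50000))          # nbytes reports ~18.6 GiB
z.nbytes / 1024 ** 3
# ~18.6
proc.memory_info().rss / 1024 ** 2    # resident set size in MiB -- stays small while pages are untouched

z[:1000, :1000] = 1.0                 # writing to a corner forces those pages to be committed
proc.memory_info().rss / 1024 ** 2    # resident memory grows after the write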
For completeness, this is the original code that produced the numbers above:
import numpy as np
x = np.identity(10**6)  # the size argument must be an integer, so 10**6 rather than 1e6
x.size
# 1000000000000
x.nbytes / 1024 ** 4
# 7.275957614183426
y = 2 * x
# python console exits and terminal shows: Killed: 9
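In case it is relevant to the last point: the alternative I am currently considering is a sparse representation, since an identity matrix of this size has only 10^6 nonzero entries. This is just a sketch assuming SciPy is installed:
from scipy import sparse

x_sparse = sparse.identity(10**6, format="csr")  # stores only the 1e6 nonzero diagonal values
x_sparse.data.nbytes / 1024 ** 2
# ~7.6 MiB for the stored values (plus the index arrays)
y_sparse = 2 * x_sparse                          # result stays sparse, so memory stays small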