The size of the data you are creating depends on the matrix shape and on the precision (data type) of its values. np.random.normal creates a matrix of float64 values by default. The "64" means each number uses 64 bits, i.e. 8 bytes of memory (8 bits per byte). If your matrix has a shape of 4000x794832, that means you need ~23.7GB [4000*794832*8 bytes] of memory.
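A quick way to check the expected footprint before allocating anything (a small sketch, using the shape from your example):

import numpy as np
rows, cols = 4000, 794832
itemsize = np.dtype(np.float64).itemsize  # 8 bytes per float64 value
print(rows * cols * itemsize)             # 25434624000 bytes, ~23.7GB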
With 16GB of RAM that will not fit, so the machine will start swapping (if enough swap is configured), which makes the creation very slow, or it will simply run out of memory.
The question is: do you really need float64 precision? It is often more than necessary for typical scientific work. So, to save memory and speed up any subsequent mathematical operations, you can consider changing the matrix precision to float16, for example [4000*794832*2 bytes].
import numpy as np

a = np.random.normal(0, 0.7**2, size=(4000, 794832))
a.nbytes  # 25434624000 bytes, ~23.7GB (huge)

b = np.random.normal(0, 0.7**2, size=(4000, 794832)).astype(np.float16)
b.nbytes  # 6358656000 bytes, ~5.9GB (big, but at least it fits in RAM)
The problem in this case is that np.random.normal has no option to specify the numpy dtype directly, so you first create a float64 matrix and then convert it, which is not a very efficient option. But if you have no other choice...
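One way to work around the full float64 intermediate (a sketch, assuming a float16 result is acceptable; the chunk size is an arbitrary value you should tune to your RAM) is to fill the array in row chunks, so only a small float64 block exists at any moment:

import numpy as np

rows, cols = 4000, 794832
out = np.empty((rows, cols), dtype=np.float16)  # ~5.9GB, allocated once

chunk = 100  # rows generated per step; ~0.6GB as float64, adjust as needed
for start in range(0, rows, chunk):
    stop = min(start + chunk, rows)
    # the assignment casts each float64 chunk down to float16 in place
    out[start:stop] = np.random.normal(0, 0.7**2, size=(stop - start, cols))

As a side note, if float32 precision is enough for you, the newer Generator API can sample directly at that precision (np.random.default_rng().standard_normal(size=(4000, 794832), dtype=np.float32), then scale/shift as needed), which skips the float64 intermediate entirely.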