My question is simple, and I could not find a resource that answers it. Somewhat similar links deal with using asarray, with numbers in general, and the most succinct one is here.
How can I "calculate" the overhead of loading a numpy array into RAM (if there is any overhead)? Or, how to determine the least amount of RAM needed to hold all arrays in memory (without time-consuming trial and error)?
In short, I have several numpy arrays of shape `(x, 1323000, 1)`, with `x` as high as 6000. This leads to a disk usage of 30 GB for the largest file, and all files together need 50 GB. Is it therefore enough if I request slightly more than 50 GB of RAM (using Kubernetes)? I want to use the RAM as efficiently as possible, so simply requesting 100 GB is not an option.
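
For reference, this is a minimal sketch of the kind of estimate I have in mind; the `float32` dtype and the file name `big_array.npy` are placeholders, not necessarily my actual data:

```python
import numpy as np

# Estimate from shape and dtype alone (float32 is an assumption here;
# the actual dtype determines itemsize).
shape = (6000, 1323000, 1)
dtype = np.dtype("float32")
n_bytes = int(np.prod(shape, dtype=np.int64)) * dtype.itemsize
print(f"raw data: {n_bytes / 1e9:.1f} GB")  # ~31.8 GB for float32

# Inspect an existing .npy file without loading it into RAM:
# mmap_mode="r" only maps the file, so shape/dtype/nbytes are cheap to read.
arr = np.load("big_array.npy", mmap_mode="r")  # placeholder path
print(arr.shape, arr.dtype, f"{arr.nbytes / 1e9:.1f} GB")
```

If the files are plain uncompressed `.npy`, the on-disk size should be essentially `nbytes` plus a small header, so my question is really about whatever NumPy and the loading process add on top of those raw buffer sizes.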