I am trying to do simple calculations on 4D image arrays (time series included), but the broadcasting eats up a lot of RAM compared to the initialized arrays. I have already read other questions with somewhat similar problems, e.g. Memory growth with broadcast operations in NumPy.
There, a comment from "rth" says "The broadcasting does not make additional memory allocations for the initial arrays",
and the accepted answer from "Warren Weckesser" shows that the asker's use of newaxis created an extra array that had to be allocated.
I tried doing what Warren showed, but I still get a lot of RAM eaten up and I cannot figure out why. Right now I have implemented rth's chunked calculation method with good results, but it still bugs me why the direct NumPy calculation blows up in RAM usage.
Here is an example of what I do.
I initialize the array that I will add the data to, and create random raw images of uint16, since the data comes from 16-bit TIFF files containing 12-bit RAW image data. I keep the rest as float32 to save RAM; the last bit of precision is not that important.
import numpy as np
imagearraydensity = np.ones((512, 1024, 250, 20),dtype=np.float32)
imagearrayraw = np.random.randint(4096, size=(512, 1024, 250, 20), dtype=np.uint16)  # 12-bit values 0..4095
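For reference, the per-array footprints can be computed from the shapes and dtypes alone, without allocating anything:

```python
import numpy as np

shape = (512, 1024, 250, 20)
n = np.prod(shape, dtype=np.int64)  # 2_621_440_000 elements

f32_gib = n * np.dtype(np.float32).itemsize / 2**30  # the density array
u16_gib = n * np.dtype(np.uint16).itemsize / 2**30   # the raw array

print(f32_gib)  # ~9.77 GiB
print(u16_gib)  # ~4.88 GiB
```

So the two named arrays alone hold ~14.6 GiB; any extra full-size float32 temporary adds another ~9.8 GiB on top.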
I have two arrays of linear constants, calculated beforehand; here they are just random numbers:
acons = np.random.random((512, 1024)).astype(np.float32)
bcons = np.random.random((512, 1024)).astype(np.float32)
Then the calculation:
np.divide(np.subtract(imagearrayraw, bcons[:, :, np.newaxis, np.newaxis], dtype=np.float32),
          acons[:, :, np.newaxis, np.newaxis],
          imagearraydensity,  # third positional argument is the output array (out=)
          dtype=np.float32)
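As far as I understand, the inner np.subtract has no out= argument here, so it materializes a full-size float32 temporary before np.divide runs. A minimal sketch that reuses the output buffer for both steps (shapes shrunk so it runs quickly; swap in the real arrays) would be:

```python
import numpy as np

# hypothetical small shapes standing in for (512, 1024, 250, 20)
raw = np.random.randint(4096, size=(4, 8, 3, 2), dtype=np.uint16)
acons = np.random.random((4, 8)).astype(np.float32) + 0.5  # keep divisors away from zero
bcons = np.random.random((4, 8)).astype(np.float32)
density = np.empty(raw.shape, dtype=np.float32)

# subtract writes straight into density, divide then updates it in place,
# so no full-size temporary is created
np.subtract(raw, bcons[:, :, np.newaxis, np.newaxis], out=density)
np.divide(density, acons[:, :, np.newaxis, np.newaxis], out=density)
```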
With real data, my code ends up using ~36 GB of RAM on my system, with ~27 GB still in use when the calculation is done.
Is there anything I can do to reduce the RAM usage of the direct broadcasting, or is the chunk-based method I have already implemented the best way?
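For reference, a minimal sketch of the kind of chunked approach I mean (the helper name and chunk size are illustrative, not my exact code):

```python
import numpy as np

def densities_chunked(raw, acons, bcons, chunk=25):
    """Process the frame axis in slices so that only a chunk-sized
    amount of data is touched per operation."""
    out = np.empty(raw.shape, dtype=np.float32)
    a4 = acons[:, :, np.newaxis, np.newaxis]
    b4 = bcons[:, :, np.newaxis, np.newaxis]
    for start in range(0, raw.shape[2], chunk):
        sl = slice(start, start + chunk)
        # out[:, :, sl, :] is a view, so both ufuncs write in place
        np.subtract(raw[:, :, sl, :], b4, out=out[:, :, sl, :])
        np.divide(out[:, :, sl, :], a4, out=out[:, :, sl, :])
    return out
```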