I have the following Python code.

I know it's complex and long, but it doesn't matter what it actually does.

My question is why, after a few iterations of the for-loop, almost all my RAM is taken up by Visual Studio Code. The amount of RAM used by VSC increases in bumps. I monitor it on Windows 10 using Ctrl+Alt+Del --> Task Manager. When I start running this code I have 7 GB of RAM free; after 3 minutes I have 1 GB of RAM free or less. I would have expected each iteration of the for-loop to use some RAM and deallocate it when b increases by 1 (i.e. at the beginning of the next iteration).

I want to solve this issue because I actually need to run that for-loop A LOT of times to get useful results. I do not mind waiting for the computations to finish, but in its current state this code cannot produce the results I want because of this RAM issue. Please point me in the right direction on how to fix it.

The only thing that came to mind is that maybe VSC stores in RAM the graphs shown on screen by plt.show(). I deleted the plotting part of the code, but the same RAM-filling behaviour persists even when no plotting occurs at any iteration of the for-loop.

from scipy.fft import ifft
from scipy.constants import c, epsilon_0
import scipy.integrate as integrate
import numpy as np, math
from matplotlib import pyplot as plt

lambda_0 = 800 * 10**(-9) # central wavelength, 800 nm
omega_0 = 2*np.pi*c / lambda_0 # central angular frequency
delta_lambda = 50.0 * 10**(-9) # 50 nm, input in the Li (2017) paper. Can be modified accordingly to the needs.
delta_Tau = (1.47 * 10**(-3) * (lambda_0*10**9)**2 / (delta_lambda*10**9)) * 10**(-15)
delta_omega_PFT = (4*np.log(2) / delta_Tau) # equivalent (equal) to: ((4*ln(2)) / (2*pi)) * (2*pi/delta_Tau) = 0.441 * (2*pi/delta_Tau)
F = 2.0
f_number = 5
true_powers = []
diam_min, diam_max = 10.0 * 10**(-3),  500.0 * 10**(-3)
num_diams = 100
diameters = np.linspace(diam_min, diam_max, num_diams)
fluence = 0.2 * (10**4) # 0.2 J/cm^2 expressed in J/m^2, from Li (2017); note that E0 below uses this same value
DL_spot_size = 2.44 * lambda_0 * f_number

def G(omega):
    return np.exp( -(delta_Tau**2/(8*np.log(2))) * (omega-omega_0)**2 ) # Gaussian spectral envelope

xsi = 0.1 * (1.0 * 10**(-15)) / (1.0 * 10**(-3)) # pulse-front-tilt parameter: 0.1 fs per mm, in s/m

def phase_c_scalar(x, omega):
    return xsi * x * (omega-omega_0)

E0 = np.sqrt( (fluence * 8 * np.sqrt(np.log(2))) / (delta_Tau*np.sqrt(np.pi)*c*epsilon_0/2.0) )  * np.sqrt(2*np.pi*np.log(2)) / delta_omega_PFT

for b in range(diameters.shape[0]):
    # NB: everything from here down through the helper-function definitions is independent of b and could be hoisted above the for-loop.
    num_omegas_focus = 4 * 10**2
    maximum_time = 200.0 * 10**(-15)
    N = math.ceil( (np.pi*num_omegas_focus)/(2*delta_omega_PFT*maximum_time) ) - 1
    omega_max_focus = omega_0 + N*delta_omega_PFT
    omega_min_focus = omega_0 - N*delta_omega_PFT
    omegas_focus = np.linspace(omega_min_focus, omega_max_focus, num_omegas_focus) # shape (num_omegas_focus, )
    sampling_spacing_omegas_focus = (omega_max_focus - omega_min_focus) / num_omegas_focus # NB: the actual np.linspace spacing is (max-min)/(num_omegas_focus - 1)
    
    xi_min = -200.0 * 10**(-6) # -200 um, expressed in m
    xi_max = 200.0 * 10**(-6) # +200 um, expressed in m
    num_xis = 100
    xis = np.linspace(xi_min, xi_max, num_xis) 
    def prefactor(omega, xi):
        first = omega * np.exp(1j*omega*F/c) / (1j*2*np.pi*c*F)
        second = np.exp(1j*omega*xi**2/(2*F*c))
        return first * second

    def integrand_real(x, omega, xi):
        return np.real(  E0*G(omega)*np.exp(1j*phase_c_scalar(x, omega)) * np.exp(-1j*omega*xi*x/(c*F)) )

    def integrand_imag(x, omega, xi):
        return np.imag(  E0*G(omega)*np.exp(1j*phase_c_scalar(x, omega)) * np.exp(-1j*omega*xi*x/(c*F))  )

    Es_xi_omega = np.zeros((num_xis, num_omegas_focus), dtype=complex)
    for i in range(xis.shape[0]):
        for j in range(omegas_focus.shape[0]):
            integral_real = integrate.quad(integrand_real, -diameters[b]/2.0, diameters[b]/2.0, args=(omegas_focus[j], xis[i]))
            integral_imag = integrate.quad(integrand_imag, -diameters[b]/2.0, diameters[b]/2.0, args=(omegas_focus[j], xis[i]))
            Es_xi_omega[i, j] = prefactor(omegas_focus[j], xis[i]) * (integral_real[0] + 1j*integral_imag[0])
            print("We finished integral number {}".format(i*omegas_focus.shape[0]+j) + " out of {} integrals.".format(xis.shape[0]*omegas_focus.shape[0]))

    Es_xi_time = np.empty_like(Es_xi_omega, dtype=complex) # shape (xis.shape[0], omegas_focus.shape[0])
    for i in range(Es_xi_omega.shape[0]): # for each row (for each xi)
        Es_xi_time[i, :] = ifft(np.hamming(Es_xi_omega[i, :].shape[0])*Es_xi_omega[i, :]) * (sampling_spacing_omegas_focus/(2*np.pi)) * num_omegas_focus
        if i % 10000 == 0: # NB: with num_xis = 100, i never reaches 10000, so this only prints once (at i == 0)
            print(f"We have done ifft number {i} out of a total of {Es_xi_omega.shape[0]} iffts")
    returned_times_ugly = np.fft.fftfreq(num_omegas_focus, d=(sampling_spacing_omegas_focus/(2*np.pi))) # shape (num_omegas_focus, )
    returned_times_pretty = np.fft.fftshift(returned_times_ugly)
    # Reorder the columns into the same shifted order; equivalent to (but simpler than) matching indices with argmax and np.take_along_axis.
    Es_xi_time = np.fft.fftshift(Es_xi_time, axes=1)

    returned_times_pretty_mesh, xis_mesh = np.meshgrid(returned_times_pretty, xis)
    fig, ax = plt.subplots()
    cc = ax.pcolormesh(returned_times_pretty_mesh, xis_mesh, np.real(Es_xi_time), cmap='viridis')
    fig.colorbar(cc, ax=ax, label=r'$[V/m]$')
    ax.set_xlabel("t [s]")
    ax.set_ylabel("xi [m]")
    plt.show()
    plt.close(fig) # release the figure; matplotlib keeps every open figure alive, which leaks memory across iterations

    np.save("ReEfield_diameter_{}m.npy".format(diameters[b]), np.real(Es_xi_time) )
    np.save("returned_times_pretty_diameter_{}m.npy".format(diameters[b]), returned_times_pretty)
    np.save("xis_diameter_{}m.npy".format(diameters[b]), xis)

Thank you very much!

velenos14
  • Without diving too deep into that `for` loop, you _may_ want to look into spawning a child process for **each** iteration, based on [this](https://stackoverflow.com/a/15492488/5963316) SO answer (a minimal sketch of this appears after these comments). I am relying on [this](https://stackoverflow.com/a/367599/5963316) answer to tell me that NumPy arrays load all of their elements into memory at once, unlike Python generators, since this is missing from the NumPy docs ([here](https://numpy.org/doc/stable/reference/generated/numpy.array.html)). I.e., use a child process within that `for` loop, or refactor away from NumPy arrays. – Mike Dec 07 '21 at 23:16
  • This is strange; it takes just 5 MiB of additional RAM per iteration on my machine with NumPy 1.20.3 and Python 3.8.1 on Windows. The overall interpreter does not take more than 90 MiB after the first iteration. This may be due to the IDE debugger or the IDE IntelliSense storing a lot of data for each Python variable... Can you reproduce the issue outside of VSC? Can you also reproduce the problem if you put the code in a main function? – Jérôme Richard Dec 08 '21 at 00:10
  • @JérômeRichard, I can confirm that if I put the exact same code into a code.py using nano (not wrapped in a main function) and execute it with python3 code.py, the RAM usage has stayed flat for more than 5 minutes now, and I could surely leave it running for an hour or so, which is great. Thank you! It seems VSC was the problem. Or do you believe it might be because, inside VSC, all this code was in a Jupyter Notebook? Can you also write an answer so that I can accept it? Thanks once again! – velenos14 Dec 08 '21 at 14:14
  • @Mike, thank you! I will have a look; however, simply by not running the code inside a Jupyter Notebook opened with VSC, it runs indefinitely without visibly increasing RAM usage (for as many iterations of the for-loop as I want). Thank you for your time and help; I will learn something from those posts. – velenos14 Dec 08 '21 at 14:15
  • Note that usually, freed memory goes back to the allocation pool for the individual process, _not_ back to the OS. So if you're looking in your operating system's process monitor, you may not see anything _even if_ the Python garbage collector really does recognize it as unused. (The above comment is made as a general note, without having reviewed the specific code in question; for the special case at hand, it sounds like the comments pointing at the IDE / debugger were on-point.) – Charles Duffy Dec 08 '21 at 15:55
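
For reference, a minimal sketch of the isolation strategy suggested in the comments above: wrap the loop body in a function and run each diameter in its own child process, so that all of its memory is returned to the OS when the process exits. run_one_diameter is a hypothetical wrapper around the original loop body, not part of the original code:

import multiprocessing as mp
import numpy as np

diameters = np.linspace(10.0 * 10**(-3), 500.0 * 10**(-3), 100)

def run_one_diameter(diameter):
    # ... the body of the original for-loop for a single mirror diameter,
    # ending with the three np.save(...) calls ...
    pass

def main():
    for d in diameters:
        p = mp.Process(target=run_one_diameter, args=(d,))
        p.start()
        p.join() # wait for the child; its memory is freed by the OS when it exits

if __name__ == "__main__":
    main() # this guard is required on Windows, where multiprocessing spawns a fresh interpreter per child

This also dovetails with the "put the code in a main function" suggestion: on Windows, the __main__ guard is mandatory for multiprocessing anyway.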

0 Answers