I have a Python script that runs many simulations for different parameters (Q, K), plots the results, and stores them to disk.

Each set of parameters (Q, K) produces a 3D volumetric grid of 200x200x80 data points, which requires ~100 MB of memory. Part of this volumetric grid is then plotted, layer by layer, producing ~60 images.

The problem is that Python apparently does not release memory during this process. I'm not sure where the memory leak is, or what rules govern how Python decides which objects are deallocated. I'm also not sure whether the memory is lost in NumPy arrays or in matplotlib figure objects.

  1. Is there a simple way to analyze which objects in Python persist in memory and which were automatically deallocated?
  2. Is there a way to force Python to deallocate all arrays and figure objects that were created in a particular loop cycle or in a particular function call?

The relevant part of the code is here (however, it will not run; the bulk of the simulation code, including the ctypes C++/Python interface, is omitted because it is too complicated):

import os  # needed for os.makedirs below
import numpy as np
import matplotlib.pyplot as plt
import ProbeParticle as PP # this is my C++/Python simulation library, take it as blackbox

def relaxedScan3D( xTips, yTips, zTips ):
    ntips = len(zTips); 
    print " zTips : ",zTips
    rTips = np.zeros((ntips,3)) # is this array deallocated when exiting the function?
    rs    = np.zeros((ntips,3)) # and this?
    fs    = np.zeros((ntips,3)) # and this?
    rTips[:,0] = 1.0
    rTips[:,1] = 1.0
    rTips[:,2] = zTips 
    fzs    = np.zeros(( len(zTips), len(yTips ), len(xTips ) )); # and this?
    for ix,x in enumerate( xTips  ):
        print "relax ix:", ix
        rTips[:,0] = x
        for iy,y in enumerate( yTips  ):
            rTips[:,1] = y
            itrav = PP.relaxTipStroke( rTips, rs, fs ) / float( len(zTips) )
            fzs[:,iy,ix] = fs[:,2].copy()
    return fzs


def plotImages( prefix, F, slices ):
    for ii,i in enumerate(slices):
        print " plotting ", i
        plt.figure( figsize=( 10,10 ) ) # Is this figure deallocated when exiting the function ?
        plt.imshow( F[i], origin='image', interpolation=PP.params['imageInterpolation'], cmap=PP.params['colorscale'], extent=extent )
        z = zTips[i] - PP.params['moleculeShift' ][2]
        plt.colorbar();
        plt.xlabel(r' Tip_x $\AA$')
        plt.ylabel(r' Tip_y $\AA$')
        plt.title( r"Tip_z = %2.2f $\AA$" %z  )
        plt.savefig( prefix+'_%3.3i.png' %i, bbox_inches='tight' )

Ks = [ 0.125, 0.25, 0.5, 1.0 ]
Qs = [ -0.4, -0.3, -0.2, -0.1, 0.0, +0.1, +0.2, +0.3, +0.4 ]

for iq,Q in enumerate( Qs ):
    FF = FFLJ + FFel * Q
    PP.setFF_Pointer( FF )
    for ik,K in enumerate( Ks ):
        dirname = "Q%1.2fK%1.2f" %(Q,K)
        os.makedirs( dirname )
        PP.setTip( kSpring = np.array((K,K,0.0))/-PP.eVA_Nm )
        fzs = relaxedScan3D( xTips, yTips, zTips ) # is memory of "fzs" recycled or does it consume more memory each cycle of the loop ?
        PP.saveXSF( dirname+'/OutFz.xsf', headScan, lvecScan, fzs )
        dfs = PP.Fz2df( fzs, dz = dz, k0 = PP.params['kCantilever'], f0=PP.params['f0Cantilever'], n=int(PP.params['Amplitude']/dz) ) # is memory of "dfs" recycled?
        plotImages( dirname+"/df", dfs, slices = range( 0, len(dfs) ) )
user3666197
Prokop Hapala
  • The problem is that you're keeping all of the figures around and open. If you're going to use the `pyplot` state-machine interface, you need to explicitly close the figures each time. Otherwise they'll be kept around so that they can be displayed when you call `plt.show`. As a quick fix, call `plt.close()` after `plt.savefig`. – Joe Kington Sep 29 '15 at 13:41
  • Hi Prokop, for a taste of another way to harness matplotlib, see *Interactive Applications Using Matplotlib*, Benjamin V. Root (2015). – user3666197 Sep 29 '15 at 22:36
  • As an appetizer, you may enjoy looking at an embedded MVC live-`matplotlib` GUI sample: http://stackoverflow.com/a/25769600/3666197 – user3666197 Sep 29 '15 at 22:53
  • For memory management, read about the memory profilers available for Python. BTW, Python is reluctant to release memory, and even more reluctant to return it to the OS. Distributed processes may let you avoid this in the main thread / process in HPC scenarios. Most of the indicated issues are, however, solved (prevented) with a live GUI or with brute-force use of the `.clf()` / `.close()` methods. – user3666197 Sep 29 '15 at 23:01
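
As one possible starting point for the profiling suggested above, here is a minimal sketch using the standard-library `tracemalloc` module (Python 3 only, so it would not work with the Python 2 script in the question as written). The function `simulate_step` and the list `kept` are hypothetical stand-ins for one (Q, K) cycle and for objects that stay referenced between cycles. Note that `tracemalloc` only sees allocations made through Python's allocators; memory a C++ library obtains with its own `malloc` will not show up.

```python
import tracemalloc


def simulate_step(n):
    # Hypothetical stand-in for one (Q, K) cycle: allocates and returns data.
    return [float(i) for i in range(n)]


tracemalloc.start()
before = tracemalloc.take_snapshot()

kept = []  # deliberately holds references, mimicking figures that are never closed
for _ in range(5):
    kept.append(simulate_step(50000))

after = tracemalloc.take_snapshot()

# Compare the two snapshots, grouped by source line, largest growth first.
top = after.compare_to(before, "lineno")
for stat in top[:3]:
    print(stat)

grown_bytes = sum(stat.size_diff for stat in top)
tracemalloc.stop()
```

Taking one snapshot per loop cycle and comparing consecutive snapshots should point at the source lines whose allocations keep growing.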

1 Answer

Try to reuse your figure:

plt.figure(0, figsize=(10, 10))
plt.clf() #clears figure

or close your figure after saving:

...
plt.savefig(...)
plt.close()
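
Applied to a plotting loop like the one in the question, the second option might look roughly like the sketch below. It assumes the non-interactive `Agg` backend and uses random data in place of the simulation output; `plot_slices`, the temporary output directory, and the small figure size are illustrative choices, not part of the original script.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, suitable for batch scripts
import matplotlib.pyplot as plt
import numpy as np


def plot_slices(prefix, volume):
    """Save one PNG per layer of `volume`, closing each figure after saving."""
    paths = []
    for i, layer in enumerate(volume):
        fig, ax = plt.subplots(figsize=(4, 4))
        im = ax.imshow(layer, origin="lower")
        fig.colorbar(im, ax=ax)
        ax.set_title("slice %d" % i)
        path = "%s_%03d.png" % (prefix, i)
        fig.savefig(path, bbox_inches="tight")
        plt.close(fig)  # drop pyplot's reference so the figure can be freed
        paths.append(path)
    return paths


volume = np.random.rand(3, 20, 20)  # placeholder for the real 3D grid
out_dir = tempfile.mkdtemp()
saved = plot_slices(os.path.join(out_dir, "df"), volume)
```

Passing the figure object to `plt.close(fig)` closes exactly the figure that was just saved, rather than whichever figure happens to be current.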
tillsten
  • Aha, thanks, it seems that with `plt.close()` there is no memory leak anymore. Still, it would be good to have a clear idea of what the rules are and how to analyze these leaks. – Prokop Hapala Sep 29 '15 at 14:24
  • There is no leak: if you use pyplot, `plt.figure` gives you a new figure every time, and the old one remains available by calling `plt.figure(n)`, where n is the number of the figure. – tillsten Sep 29 '15 at 14:55
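
The behaviour described in the last comment can be observed directly. The following is a small sketch, assuming the `Agg` backend, that uses `plt.get_fignums()` to list the figures pyplot is still holding on to; the variable names are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window is opened
import matplotlib.pyplot as plt

plt.close("all")  # start from an empty figure registry

# Each call without a number creates a new figure that pyplot keeps alive.
for _ in range(3):
    plt.figure()
nums_open = plt.get_fignums()

# Asking for an existing number returns that figure instead of creating one.
same = plt.figure(2)
count_after_lookup = len(plt.get_fignums())

# Closing everything empties the registry, so the figures can be collected.
plt.close("all")
nums_closed = plt.get_fignums()

print(nums_open, count_after_lookup, nums_closed)
```

Printing `len(plt.get_fignums())` once per loop cycle in a long-running script is a cheap way to check whether figures are accumulating.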