
I am using scipy's LinearNDInterpolator from the interpolate module, and I'm losing memory somewhere. It would be great if someone could tell me how to recover it. I'm doing something like the following (where I've tracked memory usage on the side):

import numpy as np
from scipy import interpolate as irp # mem: 14.7 MB

X = np.random.random_sample( (2**18,2) ) # mem: 18.7 MB
Y = np.random.random_sample( (2**18,1) ) # mem: 20.7 MB
f = irp.LinearNDInterpolator( X, Y ) # mem: 85.9 MB
del f # mem: 57.9 MB

The interpolation I'm actually doing is much smaller, but it runs many times, and the accumulated memory eventually leads to a crash. Can anyone say where this extra memory is hanging out and how I can recover it?
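One check worth ruling out first (it is also suggested in the comments below, and as noted there it made no difference for me) is that the memory is merely awaiting garbage collection. A minimal test is to force a collection after the del:

import gc
import numpy as np
from scipy import interpolate as irp

X = np.random.random_sample( (2**18,2) )
Y = np.random.random_sample( (2**18,1) )
f = irp.LinearNDInterpolator( X, Y )
del f
gc.collect()        # force a full collection, in case the memory is only pending GC
print gc.garbage    # uncollectable objects found by the collector, if any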

Edit 1:

output of memory_profiler:

Line #    Mem usage    Increment   Line Contents
================================================
4   15.684 MiB    0.000 MiB   @profile
5                             def wrapper():
6   19.684 MiB    4.000 MiB     X = np.random.random_sample( (2**18,2) )
7   21.684 MiB    2.000 MiB     Y = np.random.random_sample( (2**18,1) )
8   86.699 MiB   65.016 MiB     f = irp.LinearNDInterpolator( X, Y )
9   58.703 MiB  -27.996 MiB     del f
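For reference, the profile above was produced by wrapping the snippet in a decorated function and running it under memory_profiler. The layout was roughly the following (reconstructed here, so the exact line numbers may not match the report above):

import numpy as np
from scipy import interpolate as irp
from memory_profiler import profile

@profile
def wrapper():
    X = np.random.random_sample( (2**18,2) )
    Y = np.random.random_sample( (2**18,1) )
    f = irp.LinearNDInterpolator( X, Y )
    del f

if __name__ == '__main__':
    wrapper()   # running the script prints the line-by-line memory report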

Edit 2:

The actual code I'm running is below. Each xtr is a (2*w^2, w^2) uint8 array. It works until I get to w=61, but only if I run each w separately (i.e. r_[21] ... r_[51], running each one on its own). Strangely, every w below 61 still hogs all the memory; it just doesn't bottom out until 61.

from numpy import *
from scipy import interpolate as irp

for w in r_[ 21:72:10 ]:
    print w
    t = linspace(-1,1,w)
    xx,yy = meshgrid(t,t)
    xx,yy = xx.flatten(), yy.flatten()
    P = c_[sign(xx)*abs(xx)**0.65, sign(yy)*abs(yy)**0.65]
    del t

    x = load('../../windows/%d/raw/xtr.npy'%w)
    xo = zeros(x.shape,dtype=uint8)
    for i in range(x.shape[0]):
        f = irp.LinearNDInterpolator( P, x[i,:] )   # builds a Delaunay triangulation of P via qhull
        out = f( xx, yy )                           # evaluate on the regular grid points
        xo[i,:] = out
        del f, out

    save('../../windows/%d/lens/xtr.npy'%w,xo)
    del x, xo

It errors on 61 with this message:

Python(10783) malloc: *** mmap(size=16777216) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
  File "make_lens.py", line 16, in <module>
    f = irp.LinearNDInterpolator( P, x[i,:] )
  File "interpnd.pyx", line 204, in scipy.interpolate.interpnd.LinearNDInterpolator.__init__ (scipy/interpolate/interpnd.c:3794)
  File "qhull.pyx", line 1703, in scipy.spatial.qhull.Delaunay.__init__ (scipy/spatial/qhull.c:13267)
  File "qhull.pyx", line 1432, in scipy.spatial.qhull._QhullUser.__init__ (scipy/spatial/qhull.c:11989)
  File "qhull.pyx", line 1712, in scipy.spatial.qhull.Delaunay._update (scipy/spatial/qhull.c:13470)
  File "qhull.pyx", line 526, in scipy.spatial.qhull._Qhull.get_simplex_facet_array (scipy/spatial/qhull.c:5453)
  File "qhull.pyx", line 594, in scipy.spatial.qhull._Qhull._get_simplex_facet_array (scipy/spatial/qhull.c:6010)
MemoryError
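
The traceback bottoms out in scipy.spatial.qhull.Delaunay, which LinearNDInterpolator constructs internally. One way to narrow things down (a diagnostic sketch, not part of the script above) is to build the triangulation directly in a loop and watch memory; if it grows the same way, the leak is in the qhull wrapper rather than in the interpolation step:

from numpy import *
from scipy.spatial import Delaunay

pts = random.random_sample( (61*61, 2) )   # same number of points as the w=61 case
for i in range(200):
    tri = Delaunay(pts)   # the triangulation LinearNDInterpolator builds internally
    del tri               # if memory still climbs here, the leak is in the qhull wrapper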

Edit 3:

A link to code nearly identical to the above, but independent of my data:

http://pastebin.com/BKYzVVTS

I receive the same error as above. I'm on an Intel Core 2 Duo MacBook with 2 GB of RAM. The array read into x and the array written to xo together come to only ~53 MB, yet memory usage crawls far beyond what is needed as the loop progresses.
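
(In case the pastebin link goes stale: the linked script is essentially the loop from Edit 2 with the file I/O replaced by random data. A rough equivalent is sketched below; the random stand-in arrays are reconstructed from the shapes given in Edit 2, not copied from the pastebin.)

from numpy import *
from scipy import interpolate as irp

for w in r_[ 21:72:10 ]:
    print w
    t = linspace(-1,1,w)
    xx,yy = meshgrid(t,t)
    xx,yy = xx.flatten(), yy.flatten()
    P = c_[sign(xx)*abs(xx)**0.65, sign(yy)*abs(yy)**0.65]

    x = random.randint(0,256,(2*w**2,w**2)).astype(uint8)   # random stand-in for xtr.npy
    xo = zeros(x.shape,dtype=uint8)
    for i in range(x.shape[0]):
        f = irp.LinearNDInterpolator( P, x[i,:] )
        out = f( xx, yy )
        xo[i,:] = out
        del f, out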

Matt Hancock
  • How are you measuring memory consumption? Python may not release all memory, but it keeps it in its internal cache for quicker reassignments. – David Zwicker Mar 19 '14 at 12:14
  • I kept track of the above with activity monitor on OSX. Is there a way to release memory from this internal cache? I'm running something like the above in a loop a large number of times and the accumulation leads to a crash. – Matt Hancock Mar 19 '14 at 14:42
  • Ahh, that's a different situation. Python should not accumulate memory in this case. The activity monitor is not a good way of profiling memory in Python, though. Check out this post for a superior way: http://stackoverflow.com/questions/552744/how-do-i-profile-memory-usage-in-python – David Zwicker Mar 19 '14 at 16:40
  • I followed your link and installed memory_profiler, wrapped and decorated the relevant code, and edited the above question to include the output. Not much difference. – Matt Hancock Mar 19 '14 at 18:23
  • I tried looking into the scipy source code, but the `LinearNDInterpolator` class is hidden inside some compiled python file (for speed reasons, I suppose). I don't understand the scipy source code well enough to dig deeper, but it seems as if scipy doesn't handle your memory correctly. – David Zwicker Mar 19 '14 at 18:52
  • LinearNDInterpolator uses [qhull](http://www.qhull.org/). I've posted the actual code I'm attempting to run as well as the error message upon memory exhaustion. – Matt Hancock Mar 19 '14 at 20:07
  • You might want to take a look at the `gc` module - it helps you get more control over garbage collection (including why some objects might not get cleared). See for example http://code.activestate.com/recipes/65333/ – Floris Mar 19 '14 at 20:37
  • Yes I already tried adding a `gc.collect()` after `del f, out` in the inner loop. Same memory inflation :( – Matt Hancock Mar 19 '14 at 20:39

1 Answer


This issue was fixed in a later version of SciPy, which I was not using:

https://github.com/scipy/scipy/issues/3471
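
For anyone stuck on an older SciPy where upgrading is not an option, one general workaround for a leak inside a C extension is to run the leaky constructions in short-lived worker processes, so the operating system reclaims the memory when each worker exits. A rough sketch of that idea applied to the loop from Edit 2 (the worker function, pool parameters, and chunk size are illustrative choices, not part of the original script or the SciPy fix):

from numpy import *
from scipy import interpolate as irp
from multiprocessing import Pool

def init_worker(P_, xx_, yy_):
    # runs in each freshly spawned worker; stores the shared arrays as globals
    global P, xx, yy
    P, xx, yy = P_, xx_, yy_

def interp_row(row):
    # one LinearNDInterpolator construction per row, as in the original inner loop,
    # but inside a disposable worker process
    f = irp.LinearNDInterpolator( P, row )
    return f( xx, yy ).astype(uint8)

if __name__ == '__main__':
    for w in r_[ 21:72:10 ]:
        print w
        t = linspace(-1,1,w)
        xx0,yy0 = meshgrid(t,t)
        xx0,yy0 = xx0.flatten(), yy0.flatten()
        P0 = c_[sign(xx0)*abs(xx0)**0.65, sign(yy0)*abs(yy0)**0.65]

        x = load('../../windows/%d/raw/xtr.npy'%w)

        # maxtasksperchild=1 with chunksize=50 retires the worker after every
        # 50 rows, so whatever leaks per construction never accumulates far
        pool = Pool(processes=1, initializer=init_worker,
                    initargs=(P0, xx0, yy0), maxtasksperchild=1)
        rows = pool.map(interp_row, list(x), chunksize=50)
        pool.close()
        pool.join()

        save('../../windows/%d/lens/xtr.npy'%w, array(rows, dtype=uint8))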

Matt Hancock