
I run a server which acts as a data processing node for clients within the team. Recently we've been refactoring legacy code within the server to leverage numpy for some of the filtering/transform jobs.

As we have to serve this data out to remote clients, we convert the numpy data to various forms, using ndarray.tolist() as an intermediate step.
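The conversion path looks roughly like this (a simplified sketch, not the actual server code):

```python
import json
import numpy as np

arr = np.random.rand(4000000)        # stand-in for a filtered/transformed result
payload = json.dumps(arr.tolist())   # tolist() bridges ndarray -> plain Python -> JSON
```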

Each query is stateless and there are no globals, so no references are maintained between queries.

In one particular step I get an apparent memory leak, which I have been trying to track down via memory_profiler. This step involves converting a large (>4m entries) ndarray of floats to a Python list. The first time I issue the query, the tolist() call allocates ~120 MiB of memory, and then ~31 MiB is deallocated when I release the numpy array. The second (and each subsequent) time I issue the identical query, the allocation/deallocation is ~31 MiB. Each different query I issue shows the same pattern, though with different absolute values.

I've torn apart my code, and forced in some del statements for illustrative purposes. The output below is from memory_profiler's profile decorator.
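For context, here is a stripped-down, runnable sketch of the profiled lines (not the actual server code; in production ikeyData is populated from our data store):

```python
import numpy as np
from memory_profiler import profile

@profile
def convert(ikeyData):
    p = ikeyData[1]['value']       # large ndarray of floats (>4m entries)
    newArr = p.tolist()            # ndarray -> Python list
    del p                          # drop the local reference
    del ikeyData[1]['value']       # drop the dict's reference to the array
    ikeyData[1]['value'] = newArr  # store the list in its place
    return ikeyData

if __name__ == '__main__':
    convert({1: {'value': np.random.rand(4000000)}})
    convert({1: {'value': np.random.rand(4000000)}})  # second run, same pattern
```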

First issue of query:

Line #    Mem usage    Increment   Line Contents
================================================
   865    296.6 MiB      0.0 MiB           p = ikeyData[1]['value']
   866    417.2 MiB    120.6 MiB           newArr = p.tolist()
   867    417.2 MiB      0.0 MiB           del p
   868    385.6 MiB    -31.6 MiB           del ikeyData[1]['value']
   869    385.6 MiB      0.0 MiB           ikeyData[1]['value'] = newArr

Second (and subsequent) instances of the same query:

Line #    Mem usage    Increment   Line Contents
================================================
   865    494.7 MiB      0.0 MiB           p = ikeyData[1]['value']
   866    526.3 MiB     31.6 MiB           newArr = p.tolist()
   867    526.3 MiB      0.0 MiB           del p
   868    494.7 MiB    -31.6 MiB           del ikeyData[1]['value']
   869    494.7 MiB      0.0 MiB           ikeyData[1]['value'] = newArr

As you can imagine, in a long-running process with highly variable queries, these allocations build up forcing us to regularly bounce the server.

Does anyone have thoughts as to what might be happening here?

Martin Y.
  • The first thought is to examine numpy's source. – ivan_pozdeev Dec 01 '15 at 14:21
  • If only the first run produces excessive allocation - how do they "build up"? – ivan_pozdeev Dec 01 '15 at 14:23
  • Related: http://stackoverflow.com/questions/1435415/python-memory-leaks – ivan_pozdeev Dec 01 '15 at 14:26
  • @ivan_pozdeev each unique query produces a different excessive initial allocation. We have thousands of unique queries on a given day. – Martin Y. Dec 01 '15 at 15:07
  • As the [linked question](http://stackoverflow.com/questions/1435415/python-memory-leaks) and [Mike's answer](http://stackoverflow.com/a/34022660/648265) say, you should be inspecting Python's GC rather than raw memory footprint. The latter could only be useful if the former shows nothing - then excessive memory fragmentation is happening and the memory management facilities are at fault, or `numpy` is leaking (both are extremely unlikely). There's really nothing else we can say without a [MCVE]. – ivan_pozdeev Dec 01 '15 at 16:52

1 Answer


In your case Python has probably released the memory.

That does not mean the memory allocator necessarily returns the memory to the operating system. memory_profiler uses system calls to find out how much memory the process currently holds, so memory that Python has freed but the allocator has retained for reuse still shows up in its numbers. There is probably nothing wrong with your code.
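You can see the effect in isolation with a sketch like the one below (using psutil to read the process's resident set size; exact numbers will vary by platform and allocator). Memory freed by Python but retained by the allocator is also consistent with the smaller increments you see on subsequent runs of the identical query:

```python
import gc
import os
import psutil

proc = psutil.Process(os.getpid())

def rss_mib():
    """Resident set size of this process, in MiB."""
    return proc.memory_info().rss / 1024.0 / 1024.0

print("before: %7.1f MiB" % rss_mib())
big = [float(i) for i in range(4000000)]  # comparable to your 4m-entry list
print("built:  %7.1f MiB" % rss_mib())
del big
gc.collect()
print("freed:  %7.1f MiB" % rss_mib())    # typically stays well above "before"
```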

Mike Müller
  • If they "have to bounce the server regularly", there probably _is_ something wrong - whether with their code or some other. – ivan_pozdeev Dec 01 '15 at 14:26
  • The logging suggests a one-time use of 89 MiB that is not released. No sign of build-up there. Looks like the problem is somewhere else. – Mike Müller Dec 01 '15 at 14:59
  • @MikeMüller Each unique query produces a one-time allocation at that line of code. I'm the first to admit the problem could be elsewhere, but this is the only place I can find strange behaviour like this that correlates to the growth of memory over time. – Martin Y. Dec 01 '15 at 15:09
  • Every query is entirely separate, and yes there are threads behind it. I had considered whether the threading architecture was holding onto the allocation. However, later in this code **ikeyData[1]['value']** is discarded as I render out to JSON. This occurs at a layer where the threading architecture has no access to the list in question, or the mechanics of the ikeyData dict. From the perspective of the threading architecture, it calls a function which returns a string. – Martin Y. Dec 01 '15 at 15:43