
So I know this has been addressed a number of times in other questions. Here are the results I'm getting.

First, here's the code I'm running (I've removed all of the memory/debugging stuff for ease of reading):

import gc

def temp(a,b):
    print "Running Code -", (a,b)
    c = a * b 
    d = [a**c for i in range(100000)]
    del d
    gc.collect()
    print "Done"
    return c

temp(2,3)
temp(3,4)

In terms of memory, here's how I know the memory is leaking:

memory use: 11.61328125 mb
Running Code - (2, 3)
Done
memory use: 14.234375 mb
Running Code - (3, 4)
Done
memory use: 17.83984375 mb

For this test, I added the following:

import gc
import os,psutil

...

memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'

temp(2,3)
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'

temp(3,4)
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'
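The repeated measure-and-print lines above could be wrapped in one helper. The following is only a sketch that uses the standard library instead of psutil, assuming a Linux system where /proc/self/status exposes a VmRSS line; the names rss_mb and report are hypothetical and not part of the original script. It is written to run on both Python 2.7 and Python 3.

```python
def rss_mb():
    """Return the resident set size of this process in MiB.

    Assumption: Linux, where /proc/self/status has a 'VmRSS: <n> kB' line.
    Returns None when that information is not available.
    """
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    # The value is reported in kB; convert to MiB.
                    return int(line.split()[1]) / 1024.0
    except IOError:
        pass
    return None


def report():
    """Print the current memory use in the same style as the script above."""
    rss = rss_mb()
    if rss is None:
        print("memory use: unavailable")
    else:
        print("memory use: %.2f mb" % rss)
    return rss


report()
```

Calling report() before and after each temp(...) call would give the same kind of trace as the psutil version.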

So I tried to figure out where the memory is leaking. First I tried using memory_profiler and got the following:

Running Code - (2, 3)
Done
Filename: temp.py

Line #    Mem usage    Increment   Line Contents
================================================
    20     12.3 MiB     12.3 MiB   @profile
    21                             def temp(a,b):
    22     12.3 MiB      0.0 MiB       print "Running Code -", (a,b)
    23     12.3 MiB      0.0 MiB       c = a * b
    24     16.4 MiB      3.2 MiB       d = [a**c for i in range(100000)]
    25     15.0 MiB      0.0 MiB       del d
    26     15.0 MiB      0.0 MiB       gc.collect()
    27     15.0 MiB      0.0 MiB       print "Done"
    28     15.0 MiB      0.0 MiB       return c


Running Code - (3, 4)
Done
Filename: temp.py

Line #    Mem usage    Increment   Line Contents
================================================
    20     15.0 MiB     15.0 MiB   @profile
    21                             def temp(a,b):
    22     15.0 MiB      0.0 MiB       print "Running Code -", (a,b)
    23     15.0 MiB      0.0 MiB       c = a * b
    24     18.7 MiB      0.5 MiB       d = [a**c for i in range(100000)]
    25     18.7 MiB      0.0 MiB       del d
    26     18.7 MiB      0.0 MiB       gc.collect()
    27     18.7 MiB      0.0 MiB       print "Done"
    28     18.7 MiB      0.0 MiB       return c

This again is showing that del and gc.collect are not doing anything. Here's the code for this version:

from memory_profiler import profile
import gc

@profile
def temp(a,b):
    ...

temp(2,3)
temp(3,4)

So I tried using another program to see if I can notice what's happening. I tried pympler which gave me the following:

Running Code - (2, 3)
Done
                  types |   # objects |   total size
======================= | =========== | ============
                   list |        2678 |    274.16 KB
                    str |        2679 |    152.41 KB
                    int |         276 |      6.47 KB
                   dict |           2 |      2.05 KB
     wrapper_descriptor |           9 |    720     B
      getset_descriptor |           4 |    288     B
                weakref |           3 |    264     B
      member_descriptor |           3 |    216     B
                   code |           1 |    128     B
        function (temp) |           1 |    120     B
  function (store_info) |           1 |    120     B
                   cell |           2 |    112     B
         instancemethod |          -1 |    -80     B
                  tuple |          -1 |   -104     B
Running Code - (3, 4)
Done
  types |   # objects |   total size
======= | =========== | ============
   list |           2 |    192     B
    str |           3 |    149     B

Here's the code I used for this:

from pympler import tracker
import gc
tr = tracker.SummaryTracker()
...
temp(2,3)
tr.print_diff()
temp(3,4)
tr.print_diff()

So it looks like it's definitely not deleting the lists at all. BUT here's the kicker: when I try two other methods, they show that the objects are deleted for whatever reason. Here is pympler's muppy call:

                       types |   # objects |   total size
============================ | =========== | ============
                         str |       11341 |      2.18 MB
                        dict |        1037 |      1.82 MB
                        code |        3642 |    455.25 KB
                        type |         331 |    293.89 KB
          wrapper_descriptor |        2271 |    177.42 KB
                        list |         508 |     88.34 KB
  builtin_function_or_method |        1062 |     74.67 KB
                       tuple |         959 |     71.62 KB
                     weakref |         803 |     69.01 KB
           method_descriptor |         913 |     64.20 KB
                         set |         212 |     59.03 KB
           getset_descriptor |         578 |     40.64 KB
                     StgDict |          37 |     36.32 KB
         <class 'abc.ABCMeta |          32 |     28.25 KB
       _ctypes.PyCSimpleType |          27 |     23.84 KB
Running Code - (2, 3)
Done
                       types |   # objects |   total size
============================ | =========== | ============
                         str |       11340 |      2.18 MB
                        dict |        1039 |      1.82 MB
                        code |        3641 |    455.12 KB
                        type |         331 |    293.89 KB
          wrapper_descriptor |        2280 |    178.12 KB
                        list |         508 |     88.34 KB
  builtin_function_or_method |        1062 |     74.67 KB
                       tuple |         958 |     71.51 KB
                     weakref |         806 |     69.27 KB
           method_descriptor |         913 |     64.20 KB
                         set |         212 |     59.03 KB
           getset_descriptor |         582 |     40.92 KB
                     StgDict |          37 |     36.32 KB
         <class 'abc.ABCMeta |          32 |     28.25 KB
       _ctypes.PyCSimpleType |          27 |     23.84 KB
Running Code - (3, 4)
Done
                       types |   # objects |   total size
============================ | =========== | ============
                         str |       11340 |      2.18 MB
                        dict |        1039 |      1.82 MB
                        code |        3641 |    455.12 KB
                        type |         331 |    293.89 KB
          wrapper_descriptor |        2280 |    178.12 KB
                        list |         508 |     88.34 KB
  builtin_function_or_method |        1062 |     74.67 KB
                       tuple |         958 |     71.51 KB
                     weakref |         806 |     69.27 KB
           method_descriptor |         913 |     64.20 KB
                         set |         212 |     59.03 KB
           getset_descriptor |         582 |     40.92 KB
                     StgDict |          37 |     36.32 KB
         <class 'abc.ABCMeta |          32 |     28.25 KB
       _ctypes.PyCSimpleType |          27 |     23.84 KB

If you notice, the size and number of each object stays basically the same! Here's the code.

from pympler import muppy
import gc
...
muppy.print_summary()
temp(2,3)
muppy.print_summary()
temp(3,4)
muppy.print_summary()

So finally, I decided to look at the actual objects themselves. In this case I'll show the code first, and then I'll show the results.

import gc
from collections import defaultdict
from gc import get_objects

initObjs = defaultdict(int)
for i in get_objects():
    initObjs[type(i)] += 1

...

temp(2,3)
firstObjs = defaultdict(int)
for i in get_objects():
    firstObjs[type(i)] += 1

temp(3,4)
secObjs = defaultdict(int)
for i in get_objects():
    secObjs[type(i)] += 1

print [(k,firstObjs[k] - initObjs[k], secObjs[k] - firstObjs[k], secObjs[k] - initObjs[k]) for k in firstObjs if firstObjs[k] - initObjs[k]]

And now the results

Running Code - (2, 3)
Done
Running Code - (3, 4)
Done
[(<type 'function'>, 1, 0, 1), (<type 'tuple'>, -122, 0, -122), (<type 'dict'>, -2, 0, -2), (<type 'collections.defaultdict'>, 1, 1, 2)]

Notice that the only thing really being added here is the function, which makes it seem like the variables etc. should all have been cleaned up.
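The per-type counting above can be condensed into a small helper. This is a sketch only; snapshot and diff are hypothetical names. One caveat when reading the numbers: gc.get_objects() returns only objects tracked by the cyclic garbage collector (containers such as lists, dicts and functions), so plain ints like the elements of d never show up in these counts, whether they are alive or not.

```python
import gc
from collections import Counter


def snapshot():
    """Count live GC-tracked objects by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def diff(before, after):
    """Return {type name: change in count} for types whose count changed."""
    changed = {}
    for name in set(before) | set(after):
        delta = after[name] - before[name]  # Counter yields 0 for missing keys
        if delta:
            changed[name] = delta
    return changed


# Containers are tracked by the collector, plain integers are not.
print(gc.is_tracked([1, 2]))   # a list is a container
print(gc.is_tracked(10 ** 6))  # an int is an atomic object

before = snapshot()
data = [[i] for i in range(1000)]  # many small lists, all tracked
after = snapshot()
print(diff(before, after))
```

Because of that caveat, an unchanged count here says nothing about memory held by untracked objects or by the allocator itself.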

So that's basically all I have. Does anyone have any idea why my memory is not being freed? Note that I also tried without calling gc.collect(), and the memory leak still happened. I even tried without del d, and the leak continued. How do I get rid of d from my memory?

It should be noted that I'm using Python 2.7. (This is because I'm trying to program something for an application that uses Python 2.7. I'm seeing a memory leak there and created this minimal working example on my machine to try to fix it, so I can't just switch to Python 3. The application is in the process of being moved to Python 3, but that's going to take a long time.)

Also, I'm running Ubuntu 18.04 if that means anything.

EDIT

I was asked to run the script multiple times to see what happens. Here's the loop part:

i = 1
j = 2 
for k in range(10):
    i += 1
    j += 1
    temp(i,j)
    memoryUse = py.memory_info()[0]/2.**20
    print 'memory use:',memoryUse,'mb'

And here's the memory usage:

memory use: 11.62890625 mb
Running Code - (2, 3)
Done
memory use: 14.35546875 mb
Running Code - (3, 4)
Done
memory use: 17.9609375 mb
Running Code - (4, 5)
Done
memory use: 18.734375 mb
Running Code - (5, 6)
Done
memory use: 15.171875 mb
Running Code - (6, 7)
Done
memory use: 15.81640625 mb
Running Code - (7, 8)
Done
memory use: 15.81640625 mb
Running Code - (8, 9)
Done
memory use: 15.81640625 mb
Running Code - (9, 10)
Done
memory use: 16.51953125 mb
Running Code - (10, 11)
Done
memory use: 16.51953125 mb
Running Code - (11, 12)
Done
memory use: 16.51953125 mb

Edit 2

I've been asked to change the code to:

def temp(a):
    print "Running Code -", a
    d = [0] * 1000000
    del d
    gc.collect()
    print "Done"
    return a
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'

i = 1
for k in range(10):
    i += 1
    temp(i)
    memoryUse = py.memory_info()[0]/2.**20
    print 'memory use:',memoryUse,'mb'

This results in:

memory use: 11.48828125 mb
Running Code - 2
Done
memory use: 11.67578125 mb
Running Code - 3
Done
memory use: 19.1484375 mb
Running Code - 4
Done
memory use: 19.1484375 mb
Running Code - 5
Done
memory use: 19.1484375 mb
Running Code - 6
Done
memory use: 19.1484375 mb
Running Code - 7
Done
memory use: 19.1484375 mb
Running Code - 8
Done
memory use: 19.1484375 mb
Running Code - 9
Done
memory use: 19.1484375 mb
Running Code - 10
Done
memory use: 19.1484375 mb
Running Code - 11
Done
memory use: 19.1484375 mb

When I then increase 1000000 to 10000000 (I added an extra 0), then the garbage collector seems to work.

memory use: 11.34765625 mb
Running Code - 2
Done
memory use: 11.53125 mb
Running Code - 3
Done
memory use: 11.53125 mb
Running Code - 4
Done
memory use: 11.53125 mb
Running Code - 5
Done
memory use: 11.53125 mb
Running Code - 6
Done
memory use: 11.53125 mb
Running Code - 7
Done
memory use: 11.53125 mb
Running Code - 8
Done
memory use: 11.53125 mb
Running Code - 9
Done
memory use: 11.53125 mb
Running Code - 10
Done
memory use: 11.53125 mb
Running Code - 11
Done
memory use: 11.53125 mb

So then the question kind of morphs into:

  1. Why does the garbage collector work for large things but not small?
  2. How can I force it to work for smaller objects such as above?
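One thing that could be tried for question 2 on glibc-based systems such as Ubuntu is asking the C allocator to hand free heap memory back to the OS with malloc_trim. This rests on an assumption that is not confirmed anywhere in this thread: that the memory was freed by Python but is being retained by the C allocator rather than leaked. The helper name trim_heap is hypothetical, and the call may release nothing in some cases.

```python
import ctypes


def trim_heap():
    """Ask glibc to return free heap memory to the operating system.

    Returns 1 if some memory was released, 0 if none was, and None
    when malloc_trim is not available (non-glibc platforms).
    """
    try:
        libc = ctypes.CDLL("libc.so.6")  # assumption: glibc on Linux
        return libc.malloc_trim(0)       # 0 = keep no extra padding at the top
    except (OSError, AttributeError):
        return None


result = trim_heap()
print("malloc_trim result: %r" % (result,))
```

In the test script this would be called right after del d and gc.collect(), followed by another memory reading to see whether the resident size drops.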
Aram Papazian
  • Couldn't reproduce. Printed memory stats (*psutil*), and the memory drops right after *del*. Tested with *Python 2* and *Python 3*. – CristiFati Sep 12 '19 at 17:59
  • Could it be a problem with just my machine then? Are there settings that I need to check to make sure memory is not leaking in this way? – Aram Papazian Sep 13 '19 at 01:04
  • You should repeat the process forever. If the Python process will eat up all memory, then yes there is a problem. I tested on *Win* btw. – CristiFati Sep 13 '19 at 08:04
  • I'm on Ubuntu. I repeated the process 10 times and you can see the memory usage results. It never gets down to sub-13mb. But I don't see how this helps? As above, the code I plan on using will only be run once and it contains one variable that gets way too big that I need to delete. The rest of my code fails because I run into memory issues because I can't delete this one variable. – Aram Papazian Sep 13 '19 at 13:12
  • Your loop should go to much higher values (e.g. 1000000). Also the function (*temp*) 2 arguments are only useless complexity, you could have `d = [0] * 100000` for example. – CristiFati Sep 13 '19 at 13:46
  • Ok, so I tried it with a higher number since the one you gave wasn't producing a high enough memory footprint. I'm gonna edit above in order to add the new info. As you can see if I try adding one 0, the memory garbage collector continues to have problems. When I add two 0s then it works perfectly. So the question still stands. How do I make the garbage collector work in ALL cases, not just when the variable is "too big" for whatever python seems to think is "too big" – Aram Papazian Sep 13 '19 at 15:34
  • Can you please share what you ended up doing? Were there any conclusions you managed to draw out of this? – Dragolis Jul 01 '22 at 11:23
