I know this has been addressed a number of times in other questions. Here are the results I'm getting.
First, here's the code I'm running (I've removed all of the memory/debugging stuff for ease of reading):
import gc
def temp(a,b):
    print "Running Code -", (a,b)
    c = a * b
    d = [a**c for i in range(100000)]
    del d
    gc.collect()
    print "Done"
    return c
temp(2,3)
temp(3,4)
In terms of memory, here's how I know the memory is leaking:
memory use: 11.61328125 mb
Running Code - (2, 3)
Done
memory use: 14.234375 mb
Running Code - (3, 4)
Done
memory use: 17.83984375 mb
For this test, I added the following:
import gc
import os,psutil
py = psutil.Process(os.getpid())  # handle for the current process, used by py.memory_info() below
...
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'
temp(2,3)
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'
temp(3,4)
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'
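The three repeated measure-and-print lines above could be folded into a small helper. This is only a sketch: it assumes `py` is a `psutil.Process` for the current process, and the `read_rss_bytes` argument is a hypothetical hook (not part of psutil) that lets the conversion be checked without psutil installed. It is written to run on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import os


def rss_mib(read_rss_bytes=None):
    """Return the resident set size of this process in MiB.

    read_rss_bytes is a hypothetical hook: any zero-argument callable
    returning a byte count. When omitted, psutil is imported lazily and
    asked for the RSS of the current process.
    """
    if read_rss_bytes is None:
        import psutil  # third-party; the same package the snippet above imports
        process = psutil.Process(os.getpid())
        return process.memory_info().rss / 2.0 ** 20
    return read_rss_bytes() / 2.0 ** 20


def report(label, read_rss_bytes=None):
    """Print one 'memory use' line and return the reading in MiB."""
    mib = rss_mib(read_rss_bytes)
    print("memory use (%s): %.2f MiB" % (label, mib))
    return mib
```

With psutil available, `report("before")`, then `temp(2,3)`, then `report("after")` would give the same kind of readings as above. Note that dividing by 2**20 yields MiB, even though the output is labelled "mb".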
So I tried to figure out where the memory is leaking. First I tried using memory_profiler and got the following:
Running Code - (2, 3)
Done
Filename: temp.py
Line #    Mem usage    Increment   Line Contents
================================================
    20     12.3 MiB     12.3 MiB   @profile
    21                             def temp(a,b):
    22     12.3 MiB      0.0 MiB       print "Running Code -", (a,b)
    23     12.3 MiB      0.0 MiB       c = a * b
    24     16.4 MiB      3.2 MiB       d = [a**c for i in range(100000)]
    25     15.0 MiB      0.0 MiB       del d
    26     15.0 MiB      0.0 MiB       gc.collect()
    27     15.0 MiB      0.0 MiB       print "Done"
    28     15.0 MiB      0.0 MiB       return c
Running Code - (3, 4)
Done
Filename: temp.py
Line #    Mem usage    Increment   Line Contents
================================================
    20     15.0 MiB     15.0 MiB   @profile
    21                             def temp(a,b):
    22     15.0 MiB      0.0 MiB       print "Running Code -", (a,b)
    23     15.0 MiB      0.0 MiB       c = a * b
    24     18.7 MiB      0.5 MiB       d = [a**c for i in range(100000)]
    25     18.7 MiB      0.0 MiB       del d
    26     18.7 MiB      0.0 MiB       gc.collect()
    27     18.7 MiB      0.0 MiB       print "Done"
    28     18.7 MiB      0.0 MiB       return c
This again is showing that del and gc.collect are not doing anything. Here's the code for this version:
from memory_profiler import profile
import gc
@profile
def temp(a,b):
    ...
temp(2,3)
temp(3,4)
So I tried using another program to see if I can notice what's happening. I tried pympler which gave me the following:
Running Code - (2, 3)
Done
types | # objects | total size
======================= | =========== | ============
list | 2678 | 274.16 KB
str | 2679 | 152.41 KB
int | 276 | 6.47 KB
dict | 2 | 2.05 KB
wrapper_descriptor | 9 | 720 B
getset_descriptor | 4 | 288 B
weakref | 3 | 264 B
member_descriptor | 3 | 216 B
code | 1 | 128 B
function (temp) | 1 | 120 B
function (store_info) | 1 | 120 B
cell | 2 | 112 B
instancemethod | -1 | -80 B
tuple | -1 | -104 B
Running Code - (3, 4)
Done
types | # objects | total size
======= | =========== | ============
list | 2 | 192 B
str | 3 | 149 B
Here's the code I used for this:
from pympler import tracker
import gc
tr = tracker.SummaryTracker()
...
temp(2,3)
tr.print_diff()
temp(3,4)
tr.print_diff()
So it looks like it's definitely not deleting the lists at all. BUT here's the kicker: when I try two other methods, they show that the lists are deleted, for whatever reason. Here is pympler's muppy call:
types | # objects | total size
============================ | =========== | ============
str | 11341 | 2.18 MB
dict | 1037 | 1.82 MB
code | 3642 | 455.25 KB
type | 331 | 293.89 KB
wrapper_descriptor | 2271 | 177.42 KB
list | 508 | 88.34 KB
builtin_function_or_method | 1062 | 74.67 KB
tuple | 959 | 71.62 KB
weakref | 803 | 69.01 KB
method_descriptor | 913 | 64.20 KB
set | 212 | 59.03 KB
getset_descriptor | 578 | 40.64 KB
StgDict | 37 | 36.32 KB
<class 'abc.ABCMeta | 32 | 28.25 KB
_ctypes.PyCSimpleType | 27 | 23.84 KB
Running Code - (2, 3)
Done
types | # objects | total size
============================ | =========== | ============
str | 11340 | 2.18 MB
dict | 1039 | 1.82 MB
code | 3641 | 455.12 KB
type | 331 | 293.89 KB
wrapper_descriptor | 2280 | 178.12 KB
list | 508 | 88.34 KB
builtin_function_or_method | 1062 | 74.67 KB
tuple | 958 | 71.51 KB
weakref | 806 | 69.27 KB
method_descriptor | 913 | 64.20 KB
set | 212 | 59.03 KB
getset_descriptor | 582 | 40.92 KB
StgDict | 37 | 36.32 KB
<class 'abc.ABCMeta | 32 | 28.25 KB
_ctypes.PyCSimpleType | 27 | 23.84 KB
Running Code - (3, 4)
Done
types | # objects | total size
============================ | =========== | ============
str | 11340 | 2.18 MB
dict | 1039 | 1.82 MB
code | 3641 | 455.12 KB
type | 331 | 293.89 KB
wrapper_descriptor | 2280 | 178.12 KB
list | 508 | 88.34 KB
builtin_function_or_method | 1062 | 74.67 KB
tuple | 958 | 71.51 KB
weakref | 806 | 69.27 KB
method_descriptor | 913 | 64.20 KB
set | 212 | 59.03 KB
getset_descriptor | 582 | 40.92 KB
StgDict | 37 | 36.32 KB
<class 'abc.ABCMeta | 32 | 28.25 KB
_ctypes.PyCSimpleType | 27 | 23.84 KB
If you notice, the size and number of each object stays basically the same! Here's the code.
from pympler import muppy
import gc
...
muppy.print_summary()
temp(2,3)
muppy.print_summary()
temp(3,4)
muppy.print_summary()
So finally, I decided to look at the actual objects themselves. In this case I'll show the code first, and then I'll show the results.
import gc
from collections import defaultdict
from gc import get_objects
initObjs = defaultdict(int)
for i in get_objects():
    initObjs[type(i)] += 1
...
temp(2,3)
firstObjs = defaultdict(int)
for i in get_objects():
    firstObjs[type(i)] += 1
temp(3,4)
secObjs = defaultdict(int)
for i in get_objects():
    secObjs[type(i)] += 1
print [(k,firstObjs[k] - initObjs[k], secObjs[k] - firstObjs[k], secObjs[k] - initObjs[k]) for k in firstObjs if firstObjs[k] - initObjs[k]]
And now the results
Running Code - (2, 3)
Done
Running Code - (3, 4)
Done
[(<type 'function'>, 1, 0, 1), (<type 'tuple'>, -122, 0, -122), (<type 'dict'>, -2, 0, -2), (<type 'collections.defaultdict'>, 1, 1, 2)]
Notice that the only thing really being added here is the function, which makes it seem like the variables etc. should all have been cleaned up.
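The three counting loops above could be condensed into a pair of helpers, sketched below with hypothetical names (`type_census`, `census_diff`). One caveat when reading such a census: in CPython, `gc.get_objects()` only returns objects tracked by the cyclic garbage collector (containers such as lists, dicts, and functions), so plain ints and strs never appear in it, whether or not they are still alive. The sketch runs on both Python 2.7 and Python 3.

```python
from __future__ import print_function
import gc
from collections import Counter


def type_census():
    """Count the objects currently tracked by the cyclic GC, by type."""
    return Counter(type(obj) for obj in gc.get_objects())


def census_diff(before, after):
    """Return {type: change in count} for every type whose count changed."""
    changed = {}
    for kind in set(before) | set(after):
        delta = after[kind] - before[kind]
        if delta:
            changed[kind] = delta
    return changed


# Usage: take a census, create some container objects, take another census.
before = type_census()
keep = [[] for _ in range(1000)]   # 1000 inner lists plus the outer one
after = type_census()
diff = census_diff(before, after)
print("change in list count:", diff.get(list, 0))
```

Because untracked objects are invisible to this approach, an unchanged census does not by itself show that the integers held by the list in `temp` were released.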
So that's basically all I have. Does anyone have any idea why my memory is not being freed? Note that I tried not calling gc.collect() and the memory leak still happened. I even tried not doing del d and the memory leak still continued. How do I get rid of d from my memory?
It should be noted that I'm using Python 2.7. (This is because I'm trying to program something in an application that uses Python 2.7, where I'm seeing a memory leak, and I created this MWE on my machine to try to fix it, so I can't just switch to Python 3. The application is in the process of being moved to Python 3, but that's going to take a long time.)
Also, I'm running Ubuntu 18.04 if that means anything.
EDIT
I was asked to run the script multiple times to see what happens. Here's the loop part:
i = 1
j = 2
for k in range(10):
    i += 1
    j += 1
    temp(i,j)
    memoryUse = py.memory_info()[0]/2.**20
    print 'memory use:',memoryUse,'mb'
And here's the memory usage:
memory use: 11.62890625 mb
Running Code - (2, 3)
Done
memory use: 14.35546875 mb
Running Code - (3, 4)
Done
memory use: 17.9609375 mb
Running Code - (4, 5)
Done
memory use: 18.734375 mb
Running Code - (5, 6)
Done
memory use: 15.171875 mb
Running Code - (6, 7)
Done
memory use: 15.81640625 mb
Running Code - (7, 8)
Done
memory use: 15.81640625 mb
Running Code - (8, 9)
Done
memory use: 15.81640625 mb
Running Code - (9, 10)
Done
memory use: 16.51953125 mb
Running Code - (10, 11)
Done
memory use: 16.51953125 mb
Running Code - (11, 12)
Done
memory use: 16.51953125 mb
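To make the trend in these readings easier to see, the per-call change can be computed directly. The sketch below only does arithmetic on the numbers printed above; the helper name `deltas` is hypothetical. A steady leak would usually show similar growth on every call, so it may be worth checking whether the deltas stay positive or level off.

```python
from __future__ import print_function

# The eleven "memory use" readings from the run above, in order.
readings_mb = [11.62890625, 14.35546875, 17.9609375, 18.734375, 15.171875,
               15.81640625, 15.81640625, 15.81640625, 16.51953125,
               16.51953125, 16.51953125]


def deltas(samples):
    """Difference between each reading and the one before it."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


growth = deltas(readings_mb)
for call_no, change in enumerate(growth, 1):
    print("call %2d: %+.2f mb" % (call_no, change))

# Net change across the last five calls (sixth-from-last reading to last).
print("last 5 calls: %+.2f mb" % (readings_mb[-1] - readings_mb[-6]))
```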
Edit 2
I've been asked to change the code to:
def temp(a):
    print "Running Code -", a
    d = [0] * 1000000
    del d
    gc.collect()
    print "Done"
    return a
memoryUse = py.memory_info()[0]/2.**20
print 'memory use:',memoryUse,'mb'
i = 1
for k in range(10):
    i += 1
    temp(i)
    memoryUse = py.memory_info()[0]/2.**20
    print 'memory use:',memoryUse,'mb'
This results in:
memory use: 11.48828125 mb
Running Code - 2
Done
memory use: 11.67578125 mb
Running Code - 3
Done
memory use: 19.1484375 mb
Running Code - 4
Done
memory use: 19.1484375 mb
Running Code - 5
Done
memory use: 19.1484375 mb
Running Code - 6
Done
memory use: 19.1484375 mb
Running Code - 7
Done
memory use: 19.1484375 mb
Running Code - 8
Done
memory use: 19.1484375 mb
Running Code - 9
Done
memory use: 19.1484375 mb
Running Code - 10
Done
memory use: 19.1484375 mb
Running Code - 11
Done
memory use: 19.1484375 mb
When I then increase 1000000 to 10000000 (I added an extra 0), then the garbage collector seems to work.
memory use: 11.34765625 mb
Running Code - 2
Done
memory use: 11.53125 mb
Running Code - 3
Done
memory use: 11.53125 mb
Running Code - 4
Done
memory use: 11.53125 mb
Running Code - 5
Done
memory use: 11.53125 mb
Running Code - 6
Done
memory use: 11.53125 mb
Running Code - 7
Done
memory use: 11.53125 mb
Running Code - 8
Done
memory use: 11.53125 mb
Running Code - 9
Done
memory use: 11.53125 mb
Running Code - 10
Done
memory use: 11.53125 mb
Running Code - 11
Done
memory use: 11.53125 mb
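For a sense of scale between the two runs, the size of each list object can be measured with `sys.getsizeof`. This is a minimal sketch with a hypothetical helper name. It reports only the list's own pointer array (about 8 bytes per element on a 64-bit CPython build), not the shared integer 0. Whether the difference in behaviour is related to how the platform allocator handles blocks of different sizes is an assumption to test, not something these measurements establish.

```python
from __future__ import print_function
import sys


def list_object_bytes(n):
    """Bytes used by the list object [0] * n itself, per sys.getsizeof.

    Every slot refers to the same int 0, so the elements add nothing extra.
    """
    return sys.getsizeof([0] * n)


for n in (1000000, 10000000):
    mib = list_object_bytes(n) / 2.0 ** 20
    print("%9d elements: %.2f MiB" % (n, mib))
```

Comparing the first figure with the jump from 11.67578125 mb to 19.1484375 mb in the first run may help tell whether the retained memory is about one list's worth.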
So then the question kind of morphs into:
- Why does the garbage collector work for large things but not small?
- How can I force it to work for smaller objects such as above?