I have a large body of code that uses some custom classes and a lot of dictionaries. Many of the classes get dictionaries added as attributes. I am finding that it uses too much memory, especially if I am looping -- even if I manually delete some of the classes and dicts.

What I fear is happening is that the dictionaries are getting deleted, but the objects they contain persist. I need to refactor the code for better memory management, but as a quick solution I was hoping I could recursively and aggressively delete the dictionaries. How would this be achieved?

Here is an example...

import numpy as np

def get_lrg():
    # Each call allocates roughly 800 MB of float64 zeros
    return {1: np.zeros((1000,1000,100))}

class H():
    def add_lrg(self):
        fd = get_lrg()
        self.lrg = fd

for cls in ['a', 'b', 'c', 'd']:
    exec('{0} = H()'.format(cls) )
    exec('{0}.add_lrg()'.format(cls) )

del a
del b
del c
del d

Also, play around in IPython with this:

fd = get_lrg()
fd2 = get_lrg()
F = {1: fd, 2: fd2}
F = {}
F = {1: fd, 2: fd2}
del F[1]

del F

and watch the memory usage of the Python process. It doesn't appear to "release" the memory even after the dictionary "F" has been deleted (i.e. there are no references left to the objects). What I find on my machine is that the results are unpredictable. Sometimes the memory does seem to get flushed; other times it appears to be kept in use.

Brian Tompsett - 汤莱恩
John
  • Are you looking for something like this: http://stackoverflow.com/a/1073382/2391022 ? – JGibbers Aug 09 '13 at 13:08
  • Related: [Which Python memory profiler is recommended?](http://stackoverflow.com/questions/110259/which-python-memory-profiler-is-recommended) – Sven Marnach Aug 09 '13 at 13:11

2 Answers

If the objects in the dictionaries live after you've gotten rid of the dictionaries then they're supposed to live because some code references them.
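To illustrate that point, here is a minimal sketch, assuming CPython. The names `Payload` and `cache` are hypothetical stand-ins for a large object and for some forgotten reference elsewhere in the code. It shows the object surviving the dictionary's deletion only because something else still refers to it, and uses `gc.get_referrers` to find that holder.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large object stored in a dictionary."""


d = {"big": Payload()}
cache = [d["big"]]              # hypothetical stray reference elsewhere in the code
probe = weakref.ref(d["big"])   # weak reference: observes without keeping alive

del d
still_alive = probe() is not None   # the list `cache` still holds the object

# gc.get_referrers returns the containers that refer to the object
holders = [r for r in gc.get_referrers(probe()) if r is cache]

cache.clear()                       # drop the last strong reference
freed = probe() is None
```

If `still_alive` is true in real code after the dictionary is gone, inspecting the result of `gc.get_referrers` is one way to track down what is still holding on.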

Python handles memory in two ways:

  1. Reference counting
  2. A garbage collector that detects reference cycles (loosely described as mark-and-sweep)

When you delete the dictionary, you are removing a reference to the objects in question. If that's the last reference to those objects, they will be freed for you automatically.

Reference counting is not enough, however, if cycles exist between objects: every object in the cycle then has at least one live reference, even if no outside reference exists.

This is why there is also a garbage collector that cleans this up, albeit at a slightly later time. Reference counting takes care of objects when the references reach 0, the garbage collector comes into play a bit later.
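The difference between the two mechanisms can be sketched as follows, assuming CPython. The `Node` class is made up for the example, and automatic collection is switched off only to make the demonstration deterministic.

```python
import gc
import weakref


class Node:
    """Hypothetical minimal object that can point at another Node."""


gc.disable()                    # keep automatic collection out of the way
a, b = Node(), Node()
a.other, b.other = b, a         # reference cycle: a -> b -> a
probe = weakref.ref(a)          # lets us check whether `a` still exists

del a, b
# Reference counting alone cannot free the pair: each still refers to the other.
alive_after_del = probe() is not None

gc.collect()                    # force a full collection of unreachable cycles
alive_after_collect = probe() is not None
gc.enable()
```

Calling `gc.collect()` explicitly is also a way to provoke the collector rather than waiting for it to run on its own schedule.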

So there's no need to recursively delete anything, simply delete the reference to the dictionary and let Python worry about the rest.

There's another question here on SO that gives a bit more detail: "Why does python use both reference counting and mark-and-sweep for gc?"

You can verify this with the following pieces of code:

class X:
    # __del__ runs when the object is about to be destroyed
    def __del__(self):
        print("deleted")

    def __init__(self):
        print("constructed")

print("before")
x = X()
print("after")
del x
print("done")

This will show you that the __del__ method is executed as part of the del x statement.

Then you have this:

class X:
    # __del__ runs when the object is about to be destroyed
    def __del__(self):
        print("deleted")

    def __init__(self):
        print("constructed")

print("before")
x = X()
y = X()
x.y = y
y.x = x
print("after")
del x
del y
print("done")

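This will show you that cycles (x and y refer to each other) are handled differently: neither del statement runs __del__ immediately, because each object still holds a reference to the other. The pair is left for the garbage collector.

Finally, in this example x is stored in a dictionary and the dictionary is then deleted. The x object is deleted together with the dictionary: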

class X:
    # __del__ runs when the object is about to be destroyed
    def __del__(self):
        print("deleted")

    def __init__(self):
        print("constructed")

print("before")
d = {"x": X()}
print("after")
del d
print("done")
Community
Lasse V. Karlsen
  • So perhaps the delay is what is causing my problems. The issue is that I can't be certain (due to some poorly written code, no doubt) that all references are removed... – John Aug 09 '13 at 13:25
  • That is a different problem, and no doubt something you need to figure out, but it isn't because Python doesn't clean up objects being kept in dictionaries. There has to be something else keeping them alive if they won't die. In the second/middle code example I tried to provoke the garbage collector to take care of my objects, but it wouldn't. No doubt it will take care of it after a bit of time, but I didn't bother to figure out how long or how to provoke it. – Lasse V. Karlsen Aug 09 '13 at 13:28

When you delete an object, you are merely removing the reference to that object. If that object's reference count drops to 0, it is removed from memory, taking with it any references that that object holds to other objects.

Dictionaries, for example, do not contain any objects. All they contain are references to other objects. If you remove all references to a dictionary, it is automatically cleaned up and deleted, and all its references are gone too. Any keys or values the dictionary referenced will see their reference count drop by 1; and they, in turn, will be cleaned up if that count drops to 0.

There is therefore no need to recursively delete anything. If your objects are no longer referenced, they are cleaned up automatically.
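As a small sketch of that bookkeeping, assuming CPython, `sys.getrefcount` can be used to watch a value's count change as a dictionary that refers to it is created and deleted. The absolute numbers vary between interpreter versions, so only the differences are meaningful here.

```python
import sys

value = object()
before = sys.getrefcount(value)

d = {"k": value}                 # the dict stores a reference, not a copy
with_dict = sys.getrefcount(value)

del d                            # the dict is freed and its reference is dropped
after = sys.getrefcount(value)

gained = with_dict - before      # one extra reference while the dict exists
restored = after == before       # count returns to its earlier value
```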

Note that even when Python releases objects, the process's memory usage does not necessarily drop. Operating systems can and do keep memory assigned to a process to reduce memory churn: the process may need to grow again, and unless there is an urgent need for the memory elsewhere, the allocation is retained for some time.
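One way to separate the two measurements is to ask Python itself how much it has allocated, rather than watching the process size in a system monitor. The following is a rough sketch using the standard library's `tracemalloc` module (Python 3.4 and later), with a `bytearray` standing in for a large array. It is an illustration under those assumptions, not a profiling recipe.

```python
import tracemalloc

SIZE = 10 * 1024 * 1024                 # 10 MiB test allocation

tracemalloc.start()
baseline, _ = tracemalloc.get_traced_memory()

blob = bytearray(SIZE)                  # allocated through Python's allocator
held, _ = tracemalloc.get_traced_memory()

del blob                                # last reference gone, buffer is released
released, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

grew = held - baseline                  # at least SIZE bytes while blob exists
left = released - baseline              # small once blob is deleted
```

If `left` is small while the operating system still reports a large process, the objects have been freed and the remainder is memory the process is holding on to, which is the situation described above.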

Martijn Pieters
  • I added some detail to show what I'm talking about, but in the meantime, there's been some further discussion... – John Aug 09 '13 at 13:24
  • @John: Do not conflate memory allocation with memory usage. Operating systems are free to leave memory allocated to processes even when they no longer need to have it. – Martijn Pieters Aug 09 '13 at 13:29