
Recently I have been confused about memory management in Python. My first question is about dict. Say I have a nested dict object like

d = {id1: {'x': 'a', 'y': [1,2,3], 'z': {'k', 'v'}}, id2: {...}}

If I call del,

del d[id1]

will d[id1]['y'] and d[id1]['z'] be reclaimed along with it?

My second question is about list. I read the answers from here, so I tried it myself. Here is my code:

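import gc
import time
from collections import defaultdict
from pprint import pprint

def f():
    # Count the objects tracked by the garbage collector, grouped by type.
    d = defaultdict(int)
    objects = gc.get_objects()
    for o in objects:
        d[type(o)] += 1
    # Print the five most common types, most frequent first.
    x = sorted(d.items(), key=lambda i: i[1], reverse=True)
    pprint(x[:5])

def loop():
    while True:
        # Rebinding `leaked` drops the previous list of lists on each pass.
        leaked = [[x] for x in range(100)]
        f()
        time.sleep(0.1)

if __name__ == "__main__":
    loop()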

When the range is 100, the function f does show the number of list objects increasing. But when I change the range to 1000, nothing changes: the number of list objects stays the same. Can anybody tell me what the problem is?


2 Answers


del removes that reference to the object from the current namespace. In CPython, when an object's reference count reaches 0, its memory becomes available for Python to reuse for future objects (it doesn't necessarily go back to the OS).

Consider:

a = []
b = a
del a  # The list doesn't get freed because `b` is still a reference to that list
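A minimal sketch of the same idea, using sys.getrefcount to watch the count change. This assumes CPython; the absolute numbers are an implementation detail (the call itself holds a temporary reference), so only the differences are meaningful. The names before, after_alias and after_del are illustrative only.

```python
import sys

a = []
before = sys.getrefcount(a)       # includes the temporary reference held by the call

b = a                             # a second name for the same list
after_alias = sys.getrefcount(a)  # one more reference than before

del a                             # removes the name `a`, not the list itself
after_del = sys.getrefcount(b)    # back to the original count; the list is still alive

print(after_alias - before, after_del - before)
```

The list is only freed once `b` goes away as well.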

In your scenario, when you del d[id1], you remove a reference to that (inner) dictionary. If that was the last reference, the inner dictionary is freed, and since it holds references to other objects, each of those objects now has one fewer reference. If their reference count reaches 0, they'll be collected too, every object they hold a reference to will have its refcount decremented, and so on.
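The cascade can be observed with weakref.finalize, which runs a callback when its target is freed. This is a sketch assuming CPython's reference counting; Payload is a hypothetical stand-in for the inner values (plain list and dict instances cannot be weak-referenced), and the string keys replace id1 and id2 from the question.

```python
import weakref

class Payload:
    """Hypothetical stand-in for the inner values of the question's dict."""

freed = []  # labels of payloads whose finalizers have run

d = {"id1": {"y": Payload(), "z": Payload()}, "id2": {"y": Payload()}}
weakref.finalize(d["id1"]["y"], freed.append, "id1.y")
weakref.finalize(d["id1"]["z"], freed.append, "id1.z")
weakref.finalize(d["id2"]["y"], freed.append, "id2.y")

kept = d["id2"]["y"]   # an extra reference from outside the dict

del d["id1"]           # last reference to this inner dict: its values go with it
freed_after_first = sorted(freed)

del d["id2"]           # inner dict is freed, but `kept` keeps its payload alive
freed_after_second = sorted(freed)

del kept               # now the last reference is gone
freed_after_third = sorted(freed)

print(freed_after_first, freed_after_second, freed_after_third)
```

Other Python implementations may run the finalizers later, since they do not necessarily use reference counting.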

mgilson

"Will the d[id1]['y'] and d[id1]['z'] be reclaimed together?"

Assuming nothing else references that dictionary or its contents, everything it holds drops to a reference count of 0 in the same cascade. However, the language itself gives no guarantee that any of it will be collected immediately.

"....Anybody could tell me what's the problem?"

CPython caches small integer objects, so they are always referenced:

http://docs.python.org/2/c-api/int.html - "The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object."

That may explain the behaviour you're seeing. Instead of using x in range(100), create anonymous objects, e.g.

leaked = [object() for x in range(100)]
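The small-integer cache quoted above can be checked with an identity test. This is a sketch of CPython behaviour only, not a language guarantee; int("...") is used so the values are built at run time rather than folded into shared constants by the compiler, and the variable names are illustrative.

```python
# Values inside the cached range -5..256 come back as the same object.
small_a = int("100")
small_b = int("100")
small_shared = small_a is small_b

# Values outside the range are created fresh on each call.
large_a = int("1000")
large_b = int("1000")
large_shared = large_a is large_b

print(small_shared, large_shared)
```

So with range(100) every x already exists in the cache, while range(1000) creates new int objects for the values above 256.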
Tom Dalton