For example, I have code that produces many integers:

import sys
import random
a = [random.randint(0, sys.maxint) for i in xrange(10000000)]

After running it, htop shows VIRT 350M, RES 320M.

Then I do:

del a

But memory usage is still VIRT 272M, RES 242M (before producing the integers it was VIRT 24M, RES 6M).

pmap shows that the process has two big pieces of [anon] memory.

Python 3.4 does not behave this way: the memory is freed when I delete the list.

What is happening? Does Python keep the integers in memory?

Pavel Patrin

1 Answer

Here's how I can duplicate it. If I start Python 2.7, the interpreter uses about 4.5 MB of memory. (I'm quoting "Real Mem" values from the Mac OS X Activity Monitor.app.)

>>> import random, sys
>>> a = [random.randint(0, sys.maxint) for i in xrange(10000000)]

Now, memory usage is ~ 305.7 MB.

>>> del a

Removing a seems to have no effect on memory.

>>> import gc
>>> gc.collect()   # perform a full collection

Now, memory usage is 27.7 MB. Sometimes, the first call to collect() doesn't seem to do anything, but a second collect() call will clean things up.
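
If you would rather take these readings from inside the interpreter than from Activity Monitor or htop, here is a minimal sketch. It assumes Linux, because it reads VmRSS from /proc/self/status; on other systems the helper simply returns None. The names rss_kb and measure are mine, chosen for illustration, and the shims at the top only exist so the same file runs on both Python 2 and Python 3.

```python
import gc
import random
import sys

# Compatibility shims (assumption: we want one script for Python 2 and 3).
try:
    xrange
except NameError:
    xrange = range
MAXINT = getattr(sys, "maxint", sys.maxsize)


def rss_kb():
    """Return the current resident set size in KiB, or None if unavailable."""
    try:
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except IOError:
        pass  # no /proc filesystem, e.g. on Mac OS X
    return None


def measure(n=10000000):
    """Record the RSS at each step of the experiment as (label, KiB) pairs."""
    readings = [("start", rss_kb())]
    a = [random.randint(0, MAXINT) for _ in xrange(n)]
    readings.append(("after building list", rss_kb()))
    del a
    readings.append(("after del", rss_kb()))
    gc.collect()  # full collection
    readings.append(("after gc.collect()", rss_kb()))
    return readings
```

Calling measure() and printing the pairs gives one reading per step, which makes it easier to compare Python 2 and Python 3 on the same machine.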

But this behavior is by design; Python isn't leaking. This old FAQ on effbot.org explains a bit more about what's happening:

“For speed”, Python maintains an internal free list for integer objects. Unfortunately, that free list is both immortal and unbounded in size. floats also use an immortal & unbounded free list.

Essentially, Python 2 keeps the memory that held freed integer objects on an internal list, under the assumption that you will create more integers later and the space can be reused for them.
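
To make the idea of an unbounded free list concrete, here is a toy model in pure Python. It is only an illustration of the general technique, not CPython's actual C implementation; the class ToyFreeList and its attribute names are hypothetical.

```python
class ToyFreeList(object):
    """Toy model of an unbounded free list; NOT CPython's real allocator."""

    def __init__(self):
        self.free_slots = []     # released slots kept for reuse, never shrunk
        self.slots_created = 0   # how many fresh slots were ever created

    def allocate(self, value):
        if self.free_slots:
            slot = self.free_slots.pop()  # reuse a previously released slot
        else:
            slot = [None]                 # stand-in for freshly obtained memory
            self.slots_created += 1
        slot[0] = value
        return slot

    def release(self, slot):
        slot[0] = None
        self.free_slots.append(slot)      # retained instead of given back


pool = ToyFreeList()
live = [pool.allocate(i) for i in range(1000)]
for slot in live:
    pool.release(slot)
del live

# Nothing uses the slots any more, yet the pool still holds all of them.
held_after_release = len(pool.free_slots)

# Later allocations reuse the held slots instead of creating new ones.
again = [pool.allocate(i) for i in range(500)]
```

The point of the model is that releasing a slot never shrinks the pool: the high-water mark of simultaneously live values determines how much stays reserved.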

Consider this:

# 4.5 MB    
>>> a = [object() for i in xrange(10000000)]
# 166.7 MB
>>> del a
# 9.1 MB

In this case, it's pretty obvious that Python is not keeping the objects around in memory: removing a drops the last reference to them, and everything is cleaned up right away.

As I recall, CPython will actually keep low-valued integers in memory forever (-5 through 256). This may explain why the gc.collect() call doesn't return as much memory as removing the list of objects.
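
The small-integer cache can be observed with an identity check. This is a sketch that relies on a CPython implementation detail, so other interpreters may behave differently; the values 100 and 1000000 are just my picks for "inside" and "well outside" the cached range, and the variable names are illustrative.

```python
# Build the integers at run time so the compiler cannot merge constants.
small_a = int("100")
small_b = int("100")
large_a = int("1000000")
large_b = int("1000000")

# On CPython, small integers come from a shared cache, so both names
# refer to the very same object.
same_small = small_a is small_b

# Larger integers are created fresh each time, so the objects differ
# even though the values are equal.
same_large = large_a is large_b

print("small ints share one object: %s" % same_small)
print("large ints share one object: %s" % same_large)
```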


I looked around through the PEPs a bit to figure out why Python 3 is different. However, I didn't see anything obvious. If you really wanted to know, you could dig around in the source code.

Suffice it to say that in Python 3, either the integer caching behavior has changed or the garbage collector got better.

Many things are better in Python 3.

Seth
  • I call gc.collect() many times, but memory is still occupied (Linux, Kubuntu 15.04, 3.19.0-25-generic, x86_64). – Pavel Patrin Aug 14 '15 at 10:40
  • Hi @Pavel - I tried on a Mint 17.2 (Ubuntu 14.04 - 3.13.0-37) machine with Python 2.7.6. The test array uses VIRT 348M RES 316M. `del a` reduces this to VIRT 270M RES 240M. So for me, some memory is freed, possibly just the list. I don't see any [obvious bugs](https://hg.python.org/cpython/raw-file/v3.4.3/Misc/NEWS) that seem to be related to this, so it's _possible_ that this is a new bug. However, if this problem is more than a curiosity for you and is holding you up, I'd suggest using a generator instead of a list, or if you really need a list, do your work in a child process. – Seth Aug 14 '15 at 16:20
  • Seth, it is not a "critical bug" for my code, but it is very interesting. – Pavel Patrin Aug 15 '15 at 19:46
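
The two workarounds mentioned in the comments above (a generator instead of a list, or doing the work in a child process) could look roughly like this. It is a sketch under assumptions: taking the maximum is only a placeholder for whatever the real work is, and the function names random_ints, max_streaming and max_in_child are hypothetical.

```python
import random
import subprocess
import sys

# Compatibility shims (assumption: one script for Python 2 and 3).
try:
    xrange
except NameError:
    xrange = range
MAXINT = getattr(sys, "maxint", sys.maxsize)


def random_ints(n):
    """Yield n random integers one at a time instead of storing them all."""
    for _ in xrange(n):
        yield random.randint(0, MAXINT)


def max_streaming(n):
    """Workaround 1: consume a generator, so only one value is live at once."""
    return max(random_ints(n))


def max_in_child(n):
    """Workaround 2: build the big list in a child interpreter.

    Whatever memory the child reserved goes back to the operating system
    when the child exits, regardless of how its allocator behaves.
    """
    code = (
        "import random, sys\n"
        "m = getattr(sys, 'maxint', sys.maxsize)\n"
        "a = [random.randint(0, m) for _ in range(%d)]\n"
        "print(max(a))\n" % n
    )
    output = subprocess.check_output([sys.executable, "-c", code])
    return int(output.decode().strip())
```

The generator version keeps memory flat in the first place, while the child-process version accepts the peak but confines it to a process that goes away afterwards.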