I am getting "high water mark" memory leak behavior when I run:

import gc
temp = [[0.1] for _ in xrange(10 ** 7)]
del temp
gc.collect()

The resident memory starts at ~7 MB, climbs to ~1000 MB, and then settles at ~312 MB. Subsequent runs do not raise it above 312 MB. Why does this happen, and are there any known workarounds?

Various observations:

  1. It happens on Ubuntu 14.04 but not on OS X
  2. It does not happen in Python 3
  3. [[] for _ in xrange(10 ** 7)] does not leak
  4. [0.1 for _ in xrange(10 ** 7)] does not leak
  5. [(0.1,) for _ in xrange(10 ** 7)] does not leak
  6. {random.random(): {0.1: 0.1} for _ in xrange(10 ** 7)} does leak
  7. Clearing the individual lists one at a time doesn't help
  8. Running in the Python shell vs. from a file doesn't seem to make a difference
  9. I have reproduced the behavior in Python versions 2.7.15, 2.7.14, 2.7.11, and 2.7.5

My first intuition was that it is caused by arenas not getting cleaned up. But that doesn't make sense: I would then expect the same behavior with [0.1 for _ in xrange(10 ** 7)], and that does not happen.

Why does nesting the list/dictionary result in this high water mark behavior?

I am primarily measuring the resident memory using htop.
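For anyone who wants to reproduce the numbers without watching htop, the resident memory can also be read from inside the process. This is only a minimal sketch, assuming Linux (it reads /proc/self/statm and relies on ru_maxrss being reported in kilobytes, which is the Linux convention). The helper names current_rss_mb, peak_rss_mb, and measure are made up for illustration, and the list size is smaller than in the example above so that it runs quickly.

```python
import gc
import os
import resource


def current_rss_mb():
    """Current resident set size in MB, read from /proc (Linux only)."""
    with open("/proc/self/statm") as f:
        # Second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024.0 * 1024.0)


def peak_rss_mb():
    """Peak resident set size in MB (ru_maxrss is in kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def measure(build, n):
    """Build a container with build(n), drop it, and return the RSS in MB
    before building, while the container is alive, and after collecting."""
    before = current_rss_mb()
    temp = build(n)
    held = current_rss_mb()
    del temp
    gc.collect()
    after = current_rss_mb()
    return before, held, after


if __name__ == "__main__":
    # range works on both Python 2 and 3; on Python 2, xrange avoids
    # building an extra list of integers.
    n = 10 ** 6
    before, held, after = measure(lambda k: [[0.1] for _ in range(k)], n)
    print("before: %.1f MB" % before)
    print("held:   %.1f MB" % held)
    print("after:  %.1f MB" % after)
    print("peak:   %.1f MB" % peak_rss_mb())
```

Swapping the lambda for one of the other expressions listed above (for example a flat list of floats) makes it possible to compare the "after" value across cases in one script.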

Tim Martin