
This bit stung me recently. I solved it by removing all comparisons of numpy arrays with lists from the code. But why does the garbage collector fail to collect it?

Run this and watch it eat your memory:

import numpy as np
r = np.random.rand(2)   
l = []
while True:
    r == l

Running on 64bit Ubuntu 10.04, virtualenv 1.7.2, Python 2.7.3, Numpy 1.6.2

Saullo G. P. Castro
Hauke

2 Answers


Just in case someone stumbles on this and wonders...

@Dugal yes, I believe this is a memory leak in current numpy versions (Sept. 2012) that occurs when some exceptions are raised (see this and this). Why adding the gc call that @BiRico did "fixes" it seems weird to me, though apparently it must be done right after the comparison? Maybe it's an oddity in how Python garbage-collects tracebacks. If someone knows the exception handling and garbage collection internals of CPython, I would be interested.

Workaround: This is not directly related to lists; most broadcasting exceptions, for example, trigger it. (The empty list does not fit the array's size, and an empty array results in the same leak. Note that internally an exception is prepared that never surfaces.) So as a workaround, you should probably just check first that the shapes are compatible, at least if you do this comparison a lot. Otherwise I wouldn't really worry, since this leaks just a small string if I got it right.
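A minimal sketch of that shape check, in case it helps. The helper names shapes_broadcast and safe_equal are made up for illustration and are not part of numpy, and returning False on a mismatch is an assumption about what the caller wants. The idea is to compare trailing dimensions by hand so the failing comparison is never attempted:

```python
import numpy as np

def shapes_broadcast(shape_a, shape_b):
    """Return True if the two shapes are compatible under broadcasting rules."""
    # Walk the dimensions from the last one backwards; a pair is compatible
    # when the sizes match or one of them is 1. Missing leading dimensions
    # are always fine, so zip() stopping at the shorter shape is enough.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

def safe_equal(a, b):
    """Elementwise a == b, or False if the shapes cannot be broadcast."""
    a = np.asarray(a)
    b = np.asarray(b)
    if not shapes_broadcast(a.shape, b.shape):
        # Hypothetical choice: report "not equal" instead of comparing.
        return False
    return a == b

r = np.random.rand(2)
l = []
result = safe_equal(r, l)  # shapes (2,) and (0,) do not broadcast
```

With this, the loop from the question would call safe_equal(r, l) instead of r == l, so the internal broadcasting error is never prepared.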

FIXED: This issue will be fixed with numpy 1.7.

seberg

Sorry I cannot give a more complete answer, but this seems to have something to do with garbage collection. I was able to recreate this issue using Python 2.7.2, numpy 1.6.1 on Redhat 5.8. However, when I tried the following, memory usage went back to normal.

import gc
import numpy as np
r = np.random.rand(2)   
l = []
while True:
    r == l
    gc.collect()

Bi Rico