I have a file called main.py that imports another file, Optimisers.py, which contains only functions and is called from a for loop in main.py. Each of these functions implements a different optimisation routine.

Optimisers.py in turn calls functions from two other similar function-only files, and those calls sit inside while loops. All of these files use numpy.

I believe the loops, in which functions repeatedly create and use numpy arrays, are leading to a memory overload. As a result I cannot finish some optimisation algorithms, or cycle through all the coordinates I would like to.

How do I ensure removal of variables in numpy? As I understand it, numpy's C libraries complicate the standard Python process. What does the %reset array command (from the link below) do? And where should I implement it?

P.S. I've read "Releasing memory of huge numpy array in IPython", and gc.collect() does not work either.

Community
AER
  • Can you post a minimal example that demonstrates the problem? Try to strip out the actual optimization algorithm and show only the code that creates the arrays in a loop, using them in a way that still causes the leak. – user4815162342 Jun 01 '14 at 08:23
  • `%reset` applies only to `IPython` (`%` is part of the IPython 'magic' syntax) and deals with its cache of input/output values. Unless you are running in `IPython`, this is not relevant. – hpaulj Jun 01 '14 at 16:08

1 Answer

When a numpy array is no longer referenced, it will be automatically freed by the GC. The C objects are wrapped in Python objects, so for you it should not matter how it's implemented.

Make sure that arrays are not referenced in global variables, since those stick around until overwritten or the program exits.
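As a minimal sketch of the point about scope: an array created inside a function is released when the function returns, provided nothing outside the function keeps a reference to it. The names `evaluate` and `probes` are hypothetical, and the immediate release shown here assumes CPython's reference counting.

```python
import weakref
import numpy as np

# Weak references let us observe an array without keeping it alive.
probes = []

def evaluate(x):
    # Hypothetical loop body: the temporary array lives only in this call.
    work = np.full(100_000, x)
    probes.append(weakref.ref(work))
    return float(work.sum())

# Each call creates an array; each array is released when the call returns.
results = [evaluate(x) for x in range(3)]

# Count how many of the temporary arrays are still alive.
still_alive = sum(p() is not None for p in probes)
print(results, still_alive)
```

Had `work` been appended to a module-level list instead, every array would have stayed alive for the whole run.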

If you need to free an array from a local variable before it goes out of scope, you can use `del variablename` (or just assign something else to it, e.g. `None`), but that will not take care of any other references, just the one named.

To find out what is still referencing an object, you can use `gc.get_referrers(object)`.
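A small sketch of why `del` alone may not be enough, assuming CPython's reference counting (the variable names are only illustrative): the array stays alive as long as any other name still refers to it.

```python
import weakref
import numpy as np

def is_alive(ref):
    # A weak reference returns None once its target has been freed.
    return ref() is not None

a = np.zeros((1000, 1000))   # roughly 8 MB of float64
probe = weakref.ref(a)       # does not count as a reference to the array
b = a                        # a second strong reference to the same array

del a                        # removes only the name "a"
alive_after_del = is_alive(probe)      # the array is still reachable through b

b = None                     # drop the last strong reference
alive_after_rebind = is_alive(probe)   # now the array has been released

print(alive_after_del, alive_after_rebind)
```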
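The following sketch shows one way this could be used to track down a forgotten reference. The module-level list `history` and the function `step` are hypothetical stand-ins for whatever is holding on to arrays in the real code.

```python
import gc
import numpy as np

# Hypothetical module-level list that quietly accumulates arrays.
history = []

def step(i):
    result = np.ones(1000) * i
    history.append(result)   # the accidental extra reference
    return float(result.sum())

total = step(3)
suspect = history[0]         # the array we suspect is being kept alive

# Ask the garbage collector which container objects refer to the array.
referrers = gc.get_referrers(suspect)
found_history = any(r is history for r in referrers)

# Printing the referrer types is usually enough to spot the culprit.
for r in referrers:
    print(type(r).__name__)
```

The result also includes the module's own namespace dictionary, because the name `suspect` refers to the array too, so expect some noise in the output.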

> P.S. I've read "Releasing memory of huge numpy array in IPython", and gc.collect() does not work either.

Unless you have reference cycles or have called `gc.disable()`, `gc.collect()` will not make the memory get freed any sooner.
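To illustrate the cycle case, here is a sketch with a hypothetical `Node` class that refers to itself. Automatic collection is switched off only to make the demonstration deterministic; `gc.collect()` still works while it is disabled.

```python
import gc
import weakref
import numpy as np

class Node:
    """Hypothetical container that ends up in a reference cycle."""
    def __init__(self):
        self.data = np.zeros(1000)
        self.me = self           # the object refers to itself: a cycle

gc.disable()                     # keep automatic collection out of the demo
try:
    node = Node()
    probe = weakref.ref(node.data)

    del node                     # the cycle keeps the object (and array) alive
    alive_before_collect = probe() is not None

    gc.collect()                 # the cycle collector breaks the cycle
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_before_collect, alive_after_collect)
```

Without the `self.me = self` line, the array would already be gone right after `del node`, and the `gc.collect()` call would make no difference.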

otus
  • Good explanation. Also, a memory profiler can help pinpoint where the memory is going, which is useful when one has no clue. – Davidmh Jun 01 '14 at 13:10
  • I've tried assigning all the arrays to `None`, calling `gc.collect()`, and even using `del`. Sadly, none of these work. I haven't tried doing it for all variables, though. I'll try `gc.get_referrers()`, thank you. – AER Jun 02 '14 at 12:19
  • Is there an easy-to-use memory profiler available? I still don't know which variable is responsible. I still need to post a stripped-down version of the 1000 or so lines of code I have (excluding comments). – AER Jun 23 '14 at 04:20
  • I recommend [`memory_profiler`](https://pypi.python.org/pypi/memory_profiler), which gives you memory increments/decrements on a per-line basis – ali_m Mar 24 '15 at 00:16