Setup:

I am running a Python script in which:

  1. I open a file.
  2. For every line in the file, I create an object.
  3. I do some operations with the object.

Note that once I am done with the operations part, I no longer need the object. Every new line is independent.

Relevant Code as per request:

I have commented out all the other parts of my code, leaving only the following:

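import gc

# f is the already-open input file
for l in range(num_lines):
    inp = f.readline()[:-1]              # drop the trailing newline
    collector = [int(i) for i in inp]    # one int per character of the line
    M = BooleanFunction(collector)
    deg = M.algebraic_degree()
    del M                                # drop this name's reference to the object
    gc.collect()                         # force a full collection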

The problem:

Once created, each object consumes some amount of memory, and I am not able to free it after performing the operations. So while looping over the file, memory use keeps growing with every new object, and by around 793 lines into the file my 16 GB of RAM is completely exhausted.

What I have tried:

Using the garbage collector:

import gc
del Object
gc.collect()

However, either the garbage collector does not free the RAM, or Python does not return the memory to the system. Creating child processes is an idea, but not one I want to pursue.

Questions:

  1. Is there any way I can return all the memory currently occupied by the program to the OS? That means removing all variables (loop variables, global variables, etc.), similar to what happens when you press CTRL+C to terminate the program and all of its memory goes back to the OS.
  2. Is there a way to specifically deallocate an object (in case I am not doing it right)?

Previous questions do not cover what to do when gc.collect() fails to free the memory, or how to completely give up the memory that has been allocated.

  • Possible duplicate of [How can I explicitly free memory in Python?](https://stackoverflow.com/q/1316767/608639), [Releasing memory in Python](https://stackoverflow.com/q/15455048/608639), etc. Also see [Does calling free or delete ever release memory back to the “system”](https://stackoverflow.com/q/1421491/608639) since Python is written in C. – jww Jan 20 '19 at 07:57
  • I have read this post, doesn't answer my question or solve it. – Akhilesh Siddhanti Jan 20 '19 at 07:58
  • Can you show your code? The objects’ implementation might have a bug (holding onto non-Python resources), or you might be keeping references to them by accident, or… – Ry- Jan 20 '19 at 08:01
  • Pseudo code is useless when the details matter. I think you should focus on the problem of the memory leak rather than garbage collection which (as you have found) can’t fix a leak. If there is a cpython extension that should be your first focus. – DisappointedByUnaccountableMod Jan 20 '19 at 10:18

1 Answer

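In CPython, an object is deallocated as soon as its reference count drops to zero. The garbage collector (gc) is only needed to reclaim objects that are caught in reference cycles.

Looking at your code, every variable is re-bound in every iteration, so the reference count of the objects from the previous iteration should drop to zero.

If that doesn't happen, then I can see three main possibilities: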
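
A minimal sketch of that re-binding behaviour, assuming CPython. `Payload` is a hypothetical stand-in for the per-line object, not the real `BooleanFunction`. `weakref.finalize` registers a callback that fires when the object is deallocated, without keeping the object alive:

```python
import weakref


class Payload:
    """Hypothetical stand-in for the per-line object (not the real BooleanFunction)."""

    def __init__(self, bits):
        self.bits = bits


freed = []  # labels of the objects that have been deallocated so far

for label, line in enumerate(["0110", "1001", "1111"]):
    # Re-binding obj drops the only reference to the previous Payload.
    obj = Payload([int(ch) for ch in line])
    # The callback runs when obj is deallocated; finalize holds no strong reference to obj.
    weakref.finalize(obj, freed.append, label)

print(freed)  # on CPython: [0, 1], the last object is still bound to obj
```

If a pure-Python object behaves like this but memory still grows, the memory is probably being held somewhere other than the names in the loop.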

  1. You are unwittingly keeping a reference to that object.
  2. Garbage collection is disabled (gc.disable()) or frozen (gc.freeze(), available since Python 3.7).
  3. The objects are made by a Python extension written in C that manages its own memory.

Note that (1) or (2) doesn't have to happen in your own code. It can also happen in modules that you use.

In your case (2) should not be an issue since you force garbage collection.

For an example of (1), consider what would happen if BooleanFunction was memoized. Then a reference to each object (that you wouldn't see and can't delete) would be kept.
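
A small sketch of that memoization scenario, assuming CPython. `Payload` and `make_payload` are hypothetical names used only for illustration; whether `BooleanFunction` is actually cached anywhere is not known from the question:

```python
import functools
import gc
import weakref


class Payload:
    """Hypothetical stand-in for BooleanFunction."""

    def __init__(self, bits):
        self.bits = bits


@functools.lru_cache(maxsize=None)
def make_payload(bits):
    # bits must be hashable (a tuple) to serve as a cache key
    return Payload(bits)


obj = make_payload((0, 1, 1, 0))
probe = weakref.ref(obj)  # checks liveness without holding a strong reference

del obj
gc.collect()
alive_after_del = probe() is not None    # the cache still references the object

make_payload.cache_clear()
gc.collect()
alive_after_clear = probe() is not None  # nothing references it any more

print(alive_after_del, alive_after_clear)  # True False
```

Neither `del` nor `gc.collect()` helps while the cache holds its reference. Only clearing the cache (or bounding it with `maxsize`) releases the object.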

The only way to give all memory back to the OS is to terminate the program.
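
You said child processes are not what you are after, but for completeness here is a rough sketch of how process termination can be used per line. The worker script is hypothetical (it just counts the ones in a line) and stands in for the real computation. Each child returns all of its memory to the OS when it exits:

```python
import subprocess
import sys

# Hypothetical worker: counts the ones in a line of 0/1 characters.
WORKER = """
import sys
bits = [int(ch) for ch in sys.argv[1]]
print(sum(bits))
"""


def process_line_in_child(line):
    """Run the per-line work in a fresh interpreter and return its integer result."""
    done = subprocess.run(
        [sys.executable, "-c", WORKER, line],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(done.stdout)


results = [process_line_in_child(line) for line in ["0110", "1001", "1111"]]
print(results)  # [2, 2, 4]
```

Starting an interpreter per line is slow, so in practice one would probably hand each child a batch of lines instead.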

Edit 1:

Try running your program with the garbage collection debug flags enabled (gc.set_debug(gc.DEBUG_LEAK)). Call gc.get_count() at the end of every loop iteration, and maybe inspect gc.garbage as well (it is a list, not a function).
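
A minimal sketch of those gc calls, using a hypothetical `Node` class that deliberately forms a reference cycle. It uses `gc.DEBUG_SAVEALL` (one of the flags included in `gc.DEBUG_LEAK`) so the example stays quiet on stderr. With that flag set, everything the collector finds is appended to the list `gc.garbage` instead of being freed:

```python
import gc


class Node:
    """Hypothetical object that participates in a reference cycle."""

    def __init__(self):
        self.me = self  # self-reference: the refcount never reaches zero by itself


gc.collect()                    # start from a clean slate
gc.set_debug(gc.DEBUG_SAVEALL)  # keep collected objects in gc.garbage

Node()                          # unreachable right away, but cyclic
found = gc.collect()            # number of unreachable objects found
leaked_nodes = [o for o in gc.garbage if isinstance(o, Node)]

print(gc.get_count())           # per-generation counters, a tuple of three ints
print(found, len(leaked_nodes))

gc.set_debug(0)                 # restore normal behaviour
gc.garbage.clear()
```

If nothing from your loop shows up in `gc.garbage`, the growth is not caused by reference cycles, which points towards possibilities (1) or (3).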

For a better understanding of where the memory allocation happens and what exactly happens, you could run your script under the Python debugger. Step through the program line by line while monitoring the resident set size of the Python process with ps in another terminal.
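
As an alternative to watching ps, the process can report its own memory use. This is a sketch assuming a Unix system (the `resource` module does not exist on Windows). Note that `ru_maxrss` is the peak resident set size, so it shows growth but never shrinks. The helper name `peak_rss_mib` is made up for this example:

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mib()
hoard = [b"x" * (1024 * 1024) for _ in range(50)]  # hold on to a batch of 1 MiB blocks
after = peak_rss_mib()

print(f"peak RSS: {before:.1f} MiB -> {after:.1f} MiB")
```

Printing this value at the end of every loop iteration shows which line of the input triggers the growth, and by how much.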

Roland Smith
  • Regarding the three possibilities: 1) I am sure I am not, however the underlying class BooleanFunction allocates some space for generating algebraic_degree(), and this uses up memory. Now, in the next iteration l=1, a new memory is allocated and used to calculate the degree again. When I say M = BooleanFunction() I am actually only assigning the newly declared memory the same label from previous iteration. The old memory cannot be referenced anymore. gc.collect() cannot collect it (I am deleting all items in gc.garbage in every iteration after setting gc.set_debug(gc.DEBUG_SAVEALL)). – Akhilesh Siddhanti Jan 20 '19 at 10:02
  • 2) cannot happen, but a combination of (1) and (3) is surely happening. gc.get_count() prints (0,0,0) every loop – Akhilesh Siddhanti Jan 20 '19 at 10:04
  • @AkhileshSiddhanti Try the python debugger. See updated answer. – Roland Smith Jan 20 '19 at 10:59