36

I have hundreds of really large matrices, with shapes like (600, 800) or (3, 600, 800).

Therefore I want to de-allocate the memory used as soon as I don't really need something anymore.

I thought:

some_matrix = None

should do the job. Or is only the reference set to None while the space stays allocated somewhere in memory (for example, preserved for a later re-initialization of some_matrix)?

Additionally: sometimes I slice through the matrices, calculate something, and put the values into a buffer (a list, because it gets appended to all the time). So setting a list to None will definitely free the memory, right?

Or does some kind of unset() method exist that "deletes" an identifier together with the object it references?

daniel451

3 Answers

35

You definitely want to have a look at garbage collection. Unlike languages such as C/C++, where the programmer has to free dynamically allocated memory manually once it is no longer needed, Python has garbage collection, meaning that Python itself frees the memory when necessary.

When you use some_matrix = None, you unlink the variable from the memory space; the reference counter is decreased, and if it reaches 0, the memory is freed. When you use del some_matrix, as suggested by MSeifert, the statement itself does not free the memory, as opposed to what that answer says. According to the Python docs, this is what happens:

Deletion of a name removes the binding of that name from the local or global namespace

What happens under the hood is that the count of references to the memory space is reduced by 1, regardless of whether you assign None or use del. When this count reaches 0, the memory space can be reclaimed (CPython does so right away; other implementations may do it later). The only difference is that with del, it is clear from the context that you do not need the name anymore.
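To make the reference-count behaviour concrete, here is a minimal sketch, assuming CPython (where an object is reclaimed as soon as its last reference goes away). `Matrix` is a hypothetical stand-in for a large array, and the `weakref` probes are only there to observe whether the object still exists.

```python
import weakref


class Matrix:
    """Hypothetical stand-in for a large array."""

    def __init__(self, rows, cols):
        self.data = [[0.0] * cols for _ in range(rows)]


# Case 1: rebinding the only name to None drops the last reference.
some_matrix = Matrix(6, 8)
probe_none = weakref.ref(some_matrix)  # a weak reference does not keep the object alive
some_matrix = None
freed_by_none = probe_none() is None

# Case 2: del removes the name, which also drops the last reference.
other_matrix = Matrix(6, 8)
probe_del = weakref.ref(other_matrix)
del other_matrix
freed_by_del = probe_del() is None

# Case 3: a second name still refers to the object, so it survives.
a = Matrix(6, 8)
b = a
probe_shared = weakref.ref(a)
del a
still_alive = probe_shared() is not None

print(freed_by_none, freed_by_del, still_alive)
```

On other Python implementations the first two flags may stay False until their collector runs, which matches the caveat about implementation differences.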

If you look at the documentation for garbage collection, you will see that you can invoke it yourself or change some of its parameters.
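As a rough illustration of invoking the collector yourself: in CPython the `gc` module only deals with reference cycles, so it matters when objects refer to each other and not for a plain array bound to a single name. The sketch below assumes CPython; `Node` is a hypothetical class used only to build a cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical container that can refer to another Node."""

    def __init__(self):
        self.partner = None


gc.disable()  # keep the automatic cyclic collector out of the way for this demo
try:
    x, y = Node(), Node()
    x.partner, y.partner = y, x  # x and y now form a reference cycle
    probe = weakref.ref(x)

    del x, y  # the names are gone, but the cycle keeps both objects alive
    alive_before_collect = probe() is not None

    gc.collect()  # explicitly run the cyclic collector
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_before_collect, alive_after_collect)
```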

Vitaly Olegovitch
innoSPG
  • 1
    `gc` is only the cyclic garbage collector, it won't have an effect here. When a reference count reaches `0` it is freed *immediately* in CPython, but that is not a language guarantee. For example, this is not the case in the Jython implementation, which uses Java's garbage collector – juanpa.arrivillaga Dec 28 '18 at 23:37
  • @juanpa.arrivillaga thanks for giving clarifications for some implementations. – innoSPG Jan 02 '19 at 14:16
18

NumPy deletes arrays when the reference count reaches zero (or at least it keeps track of the reference count and lets the OS collect the garbage).

For example having

import numpy as np
a = np.linspace(0,100, 10000000)
a = None

will free the memory "immediately" (the preferred way is writing del a, though), while

import numpy as np
a = np.linspace(0,100, 10000000)
b = a
a = None

will free nothing.


You also mentioned slicing. A slice is just a view on the data and therefore behaves exactly like the second example. If you don't delete every variable that references the same array, the whole array stays in memory.
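The view behaviour can be checked directly. The following is a small sketch, not a benchmark: it uses `np.zeros` with one of the shapes from the question, and the variable names are made up for illustration.

```python
import numpy as np

a = np.zeros((600, 800))
view = a[:10]  # basic slicing returns a view, not a copy

shares = np.shares_memory(a, view)  # the view uses a's buffer
view_base_is_a = view.base is a     # the view holds a reference to a

del a  # removes the name only; the view still references the full array
kept_shape = view.base.shape

small = view.copy()  # an independent array that owns its own data
del view             # now nothing references the original buffer anymore
owns_data = small.base is None

print(shares, view_base_is_a, kept_shape, owns_data)
```

So keeping only a small slice around is enough to keep the full array in memory, unless the slice is copied first.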

If I do something very memory-expensive, I always stick with separate functions that do the operation and return only what is really necessary. Functions clean up after themselves, so any intermediate results are freed (if they are not returned).
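A sketch of that pattern, with hypothetical function names and sizes: the large intermediates exist only as local variables, and the functions return either a small new array or an explicit copy of a slice, so nothing keeps the big buffers alive after the call.

```python
import numpy as np


def column_means(shape=(600, 800), seed=0):
    """Build large temporaries locally and return only a small result."""
    rng = np.random.default_rng(seed)
    big = rng.random(shape)       # large intermediate, local to this call
    centered = big - big.mean()   # another large intermediate
    return centered.mean(axis=0)  # small new array of shape (800,)


def first_rows(n=3, shape=(600, 800), seed=0):
    """Return a few rows as a copy so the full array is not kept alive."""
    arr = np.random.default_rng(seed).random(shape)
    return arr[:n].copy()  # without .copy() the returned view would pin arr


means = column_means()
rows = first_rows()
print(means.shape, rows.shape, rows.base is None)
```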

MSeifert
  • Thanks for the clarification for functions/methods. I already assumed that local variables are thrown away after the return statement (like in Java or other languages) but it's nice to hear that Python really does the exact same thing there. However, `del some_array` or generally `del some_variable` should always be the first choice for explicitly freeing memory?! – daniel451 Feb 10 '16 at 13:56
  • ``del`` deletes the variable name, so any subsequent operation including the deleted variable will raise an Error. ``a=None`` on the other hand keeps the variable ``a`` so you "might" by accident use it later and not realize that it's actually "deleted". – MSeifert Feb 10 '16 at 14:01
  • 1
    Ok, thanks. However, the second scenario won't free anything with `del`, too?! Hence, `del` is deleting the identifier plus its reference but the object behind the reference is just deleted when no other identifier is referencing it anymore, right? So, as long as any other identifier references the object, like `b` in your example, the garbage collector will not delete the object?! – daniel451 Feb 10 '16 at 14:04
  • Unfortunately, `del` does not actually free the memory. It only removes the binding of the variable name from the namespace. If `del` were to free the memory, it would be very dangerous. If you have a big matrix `A` and define `B=A`, both `A` and `B` point to the same memory space. `del A` simply deletes the name `A`; the space is still there. It is the garbage collector that frees the memory. – innoSPG Feb 10 '16 at 14:05
  • 1
    Yes, I realize I was a bit vague with the explanation. Like @innoSPG said python only frees allocated memory if nothing is referencing it (in some way) anymore. I tried to make this distinction clear with the second example where something else was referencing it. There are good answers to the questions about garbage collection (i.e. http://stackoverflow.com/questions/14969739/python-del-statement) so I didn't include it in the answer. But if requested I could do so. – MSeifert Feb 10 '16 at 14:23
  • Thanks for the link. I upvoted your post and comment, highly welcome :) I guess adding more is not necessary, I will read your and InnoSPG's links. – daniel451 Feb 10 '16 at 14:28
  • if the separate function returns a *slice*, is it a view or is the sliced data copied? e.g. `def foo(): arr = np.random.random(100000); return arr[:3]`. And `x = foo()`. Does now `x` point to `arr`'s data blob? – kingusiu Nov 26 '20 at 16:58
  • 2
    Slicing just returns a view, so it will keep the whole memory of `arr` alive. In your case I would copy on return `return arr[:3].copy()` so that `arr` can be freed (at least if you want to return a small portion of intermediate arrays). – MSeifert Nov 26 '20 at 17:05
0

If you have to do something like the following, the memory won't be freed, although a copy of a is made implicitly:

a = np.ones((10000, 10000))
b = np.empty((10000, 10000))
b[:] = a
a = None
del a

Instead you can do the following and memory will be freed after doing a = None:

a = np.ones((10000, 10000))
b = np.empty((10000, 10000))
b[:] = np.copy(a)
a = None
del a
Amir