
When I initialize a NumPy array inside a function, Python does not seem to free the memory after the function returns, as shown in the code example below. Is there any way I can free this memory? Calling gc.collect() did not help, and the same problem occurs in both Python 2 and Python 3.

import numpy as np
import resource

def function():
    x = np.random.random([10000, 10000])

print('Memory usage: %s (kb)' % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)

function()

print('Memory usage: %s (kb)' % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)

Code output:
Memory usage: 20560 (kb)
Memory usage: 801832 (kb)

Ben N
  • Possible duplicate of [How can I explicitly free memory in Python?](https://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python) – ivan_pozdeev Apr 25 '18 at 00:35
  • Do you need to use the numpy array after the function at all? – Anish Shanbhag Apr 25 '18 at 00:37
  • I know there are duplicates somewhere. For now, read the following: http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm – juanpa.arrivillaga Apr 25 '18 at 01:10
  • Also note `gc` would be totally irrelevant here, there aren't any reference cycles, thus the memory is being managed by simple reference counting (in CPython, of course) – juanpa.arrivillaga Apr 25 '18 at 01:11

2 Answers


Python does free the memory as soon as the function returns. The problem is that the value you're printing, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss, is the peak (maximum) resident set size the process has reached so far, not its current usage, so it never goes back down.

To get the current memory usage, you might try the psutil package ($ pip install psutil). It's a cross-platform utility that reports process information such as current memory usage.

Try this modified snippet:

import os
import resource

import numpy as np
import psutil

process = psutil.Process(os.getpid())

def report_memory():
    # ru_maxrss is the peak RSS so far (kilobytes on Linux, bytes on macOS).
    print('Max  Memory usage: %s (KB)' % resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
    # memory_info().rss is the current RSS in bytes, so convert it to KB.
    print('Curr Memory usage: %s (KB)' % (process.memory_info().rss / 1024))

def my_function():
    print('### Starting function ###')
    report_memory()
    print('doing stuff...')
    x = np.random.random([10000, 10000])  # local array, released on return
    report_memory()
    print('#### Ending function ####')

report_memory()

my_function()

report_memory()

On my machine, here's the output:

Max  Memory usage: 714588 (KB)
Curr Memory usage: 26884.0 (KB)
### Starting function ###
Max  Memory usage: 714588 (KB)
Curr Memory usage: 26884.0 (KB)
doing stuff...
Max  Memory usage: 808380 (KB)
Curr Memory usage: 808380.0 (KB)
#### Ending function ####
Max  Memory usage: 808380 (KB)
Curr Memory usage: 27132.0 (KB)

By the time we get to the first memory check, it's down to 26 MB, but it's already been as high as 714 MB at some point while starting up or importing libraries.

At the start of the function it's the same, but by the end of our function, we've hit a new high with help from numpy. At this point the current usage is the new max, so both values match.

After we leave the function, our current usage drops back down to roughly where it was before we entered the function.
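
As a rough sanity check on those numbers, the size of the array can be worked out by hand. This is only a sketch, assuming np.random.random returns float64 values (8 bytes each) and that the figures are in kilobytes, as on Linux; the variable names are just for illustration.

```python
rows, cols = 10000, 10000
itemsize = 8  # bytes per element, assuming float64

# Size of the array's data buffer in KB
expected_kb = rows * cols * itemsize / 1024

# "Curr" values copied from the output above
curr_before_kb = 26884.0
curr_inside_kb = 808380.0
observed_kb = curr_inside_kb - curr_before_kb

print('Expected array size: %.0f KB' % expected_kb)
print('Observed increase:   %.0f KB' % observed_kb)
```

The two figures differ by only a few hundred KB, which is consistent with the jump in current usage being the array itself and with that memory being given back once the function returns.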

WhiteHotLoveTiger

Python can choose not to release garbage-collected memory to the OS. It may keep already-allocated memory around for future use. That does not mean there is a memory leak.
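
One way to see that the object itself is gone, whatever the OS-level numbers say, is to ask the interpreter what it still has allocated. Below is a minimal sketch using the standard-library tracemalloc module. It uses a bytearray rather than a NumPy array to stay dependency-free, and the allocate helper and the 20 MiB size are made up for illustration.

```python
import tracemalloc

SIZE = 20 * 1024 * 1024  # 20 MiB, arbitrary size for the demo

def allocate(n_bytes):
    # buf is a local; its reference count hits zero when the function returns
    buf = bytearray(n_bytes)
    return len(buf)

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
allocate(SIZE)
# get_traced_memory() returns (current, peak) in bytes
after, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print('Still allocated after the call: %d bytes' % (after - before))
print('Peak while tracing:             %d bytes' % peak)
```

The peak should reflect the large buffer while the amount still allocated after the call stays small. That shows Python has released the object, even if the process-level memory reported by the OS does not shrink right away.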

kjp