
I'm running a script that uses a significant amount of memory because it operates on multiple large DataFrame objects. Since I keep getting MemoryError exceptions, I figured I need to do a better job of freeing memory during runtime, as technically there should be enough memory to hold all the relevant objects. However, it doesn't seem to be working.

In the following subsection of my code:

print("Before deleting df: {}".format(get_mem_use()))
del df
print("After deleting df: {}".format(get_mem_use()))

The output is:

Before deleting df: 185323520
After deleting df: 185323520

Where:

import os

import psutil


def get_mem_use():
    """Get the resident set size (RSS) of the current process, in bytes."""
    process = psutil.Process(os.getpid())
    return process.memory_info().rss

So deleting the DataFrame doesn't seem to change the memory usage at all. Is it being freed pre-emptively by the interpreter once it knows the DataFrame isn't referenced any more in my code, or am I deleting it wrong?
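For reference, here is a self-contained sketch of the measurement I'm doing, with an explicit gc.collect() added as something I could try. The DataFrame here is just made-up filler data (2,000,000 × 10 random floats, an arbitrary size chosen only so a change in RSS would be noticeable), not my actual data:

import gc
import os

import numpy as np
import pandas as pd
import psutil


def get_mem_use():
    """Get the resident set size (RSS) of the current process, in bytes."""
    return psutil.Process(os.getpid()).memory_info().rss


# Build a large throwaway DataFrame so any drop in RSS is visible.
df = pd.DataFrame(np.random.rand(2_000_000, 10))

print("Before deleting df: {}".format(get_mem_use()))
del df
print("After deleting df:  {}".format(get_mem_use()))

# Explicitly run the garbage collector, to rule out the object only
# becoming unreachable rather than actually being collected.
gc.collect()
print("After gc.collect(): {}".format(get_mem_use()))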

Marses
    Possible duplicate of [How do I release memory used by a pandas dataframe?](https://stackoverflow.com/questions/39100971/how-do-i-release-memory-used-by-a-pandas-dataframe) – Austin Feb 13 '18 at 13:29
  • Thank you, that cleared things up a bit, but not all. The idea of this high-watermark they mention is that the memory usage doesn't go down below that point, right? But would this still lead to MemoryErrors? Is there absolutely nothing in Python that will clean up some of the dead memory once you get close to full usage? And can the high watermark really reach close to full usage? – Marses Feb 13 '18 at 13:42

0 Answers