I have a function that generates a huge object (about 100-150 GB of memory, on a machine with 500 GB of RAM).

The function runs in about 1 hour and writes a file to disk (about 100 MB).

But when the function ends, the program hangs for several hours without doing anything (execution never reaches the statements after the function call).

I suspect the garbage collector is trying to delete the huge object created in this function, but I don't see anything happening (strace prints nothing), and the memory usage is not decreasing.

Do you have any idea why this is happening and how to solve it? I'm using Python 3.5.

cdancette
  • Try to [switch it off](https://docs.python.org/3/library/gc.html#gc.disable)! – Klaus D. Jan 25 '18 at 14:28
  • I tried, but it doesn't change anything. – cdancette Jan 26 '18 at 16:35
  • Have you tried https://docs.python.org/3/library/gc.html#gc.set_debug or some memory analyser like [asizeof](https://pypi.python.org/pypi/Pympler) (described at https://stackoverflow.com/questions/552744/how-do-i-profile-memory-usage-in-python/33631986#33631986)? – serv-inc Dec 12 '18 at 03:27

1 Answer

Certainly not an answer, but here is a thread from the Python Developers mailing list that describes some behavior that sounds like what you are experiencing (I have experienced it too). https://mail.python.org/pipermail/python-dev/2008-December/084450.html

Having dug through the thread a bit, some interesting things have popped out:

  • Many blame this on slow swap, but the experience of the thread's OP, and my own, shows that this is not the case.
  • Others blame it on garbage collection, which I think is part of the culprit. There seems to be an implementation detail that makes freeing non-contiguous blocks of memory slow.
    • One example in the thread: garbage collecting a sorted list takes almost no time (1-2 seconds), but garbage collecting that same list after it has been shuffled takes an exorbitant amount of time.
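
To check whether the sorted-versus-shuffled effect shows up on your machine, a minimal timing sketch could look like the following. The function name `time_teardown` and the size `n` are made up for illustration, this is not the code from the thread, and the difference may be small or invisible at this scale.

```python
import random
import time


def time_teardown(n, shuffle):
    """Build a list of n distinct string objects, then time how long freeing it takes."""
    data = [str(i) for i in range(n)]  # objects are created one after another
    if shuffle:
        random.shuffle(data)  # list order no longer matches creation order
    start = time.perf_counter()
    del data  # last reference: CPython frees the list and its items right here
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 10 ** 6  # hypothetical size; raise it to get closer to the scale in the question
    print("in creation order: %.3f s" % time_teardown(n, shuffle=False))
    print("shuffled:          %.3f s" % time_teardown(n, shuffle=True))
```

If the shuffled case is much slower for large `n`, that would point at deallocation order rather than swap.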

One possible workaround is to delete the dictionary while still keeping a reference to the objects it contains. It is presented in this message (very near the end of the thread): https://mail.python.org/pipermail/python-dev/2008-December/084560.html
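
As I read that description, the idea might be sketched like this. This is my own interpretation, not the code from the linked message, and `build_table` and `keepalive` are hypothetical names; whether it helps in your case is something you would have to measure.

```python
def build_table(n):
    """Return (table, keepalive); keepalive holds every key and value in creation order."""
    table = {}
    keepalive = []
    for i in range(n):
        key = "key-%d" % i
        value = "value-%d" % i
        table[key] = value
        keepalive.append((key, value))  # extra reference outside the dict
    return table, keepalive


if __name__ == "__main__":
    table, keepalive = build_table(10 ** 5)
    del table      # drops the dict's references; no key or value is freed yet
    del keepalive  # keys and values are released here by walking the list, not the hash table
```

The cost is one extra list of references, which is small next to the objects themselves.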

Unfortunately, from the thread I haven't been able to see a clear solution to it, but hopefully this helps shed some light on what is going on!

colelemonz