I've got a long-running Python 3 task on a Linux server that keeps getting Killed, apparently due to running out of memory. The task reads in a data set of about 5000 items, runs a for loop over the list, and does some fairly intensive processing on each element. It got killed before finishing the first 100. The data processed in each iteration is independent of the other iterations, so in theory everything from one pass should be garbage-collectable before the next, yet the process still runs out of memory and gets killed.
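For reference, the overall shape of the task is roughly this (load_items and process_item are placeholders standing in for the real code, not the actual function names):

```python
def load_items():
    # Stand-in for the real loader; the actual data set is ~5000 items.
    return list(range(5000))

def process_item(item):
    # Stand-in for the intensive per-item processing.
    return item * item

def main():
    items = load_items()
    for item in items:
        process_item(item)  # nothing is carried over between iterations

if __name__ == "__main__":
    main()
```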
Are there any useful techniques for getting a memory profile of a Python 3 script and figuring out where it's leaking memory, so I can debug this? Most of the tools I've looked at, such as the memory_profiler library, seem to require you to already know where the problem is before they can help you find it: you have to annotate a known-bad function with a specific decorator. The problem is, this is a pretty big program, with over 40 .py files and a dozen third-party libraries, some of which are native code (numpy, scipy, etc.), and I don't even know where to start looking.
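To illustrate what I mean by the decorator-based workflow, this is roughly how memory_profiler is used (suspect_function here is just a made-up example, not part of my program):

```python
# memory_profiler's workflow: you must already suspect a specific
# function and mark it by hand with the @profile decorator.
from memory_profiler import profile

@profile
def suspect_function():
    # Throwaway allocations just to produce a visible memory spike.
    data = [b"x" * 1024 for _ in range(100_000)]
    return len(data)

if __name__ == "__main__":
    suspect_function()
    # Run with:  python -m memory_profiler this_script.py
    # to get a line-by-line memory report for the decorated function.
```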
What's the best way to go about figuring out where the leaks exist?