
I have a Python process that starts leaking memory after a long time (at least 10 hours, sometimes more). The issue is difficult to reproduce, so I would like to attach to the running Python interpreter when the problem comes up and inspect memory usage somehow, e.g. by getting a list of the objects that currently allocate the most memory.

This is difficult with the usual profiling tools like tracemalloc or memory-profiler, because they need to be part of the code or started together with the process, and they have a significant impact on runtime performance.
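
For illustration, this is roughly what the tracemalloc approach looks like (just a sketch): the calls have to live inside the application code, or tracemalloc has to be enabled when the process starts, e.g. via the PYTHONTRACEMALLOC environment variable, which is exactly what I want to avoid:

import tracemalloc

tracemalloc.start()  # has to run inside the target process, ideally at startup

# ... application runs for hours ...

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)  # top 10 allocation sites by total size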

What I would like is a sampling profiler that I can simply attach to an existing Python process, like py-spy, but py-spy only gives me insights into CPU time spent in functions, not memory usage.

Is there another tool or a different approach that would help me to get insights into the memory usage of an existing Python process?

Edit: I just found pyrasite, which provides the pyrasite-memory-viewer command. That is exactly what I'm looking for, but unfortunately the project seems to be abandoned and I can't get it to work with Python 3.8.

klamann
  • this may help https://stackoverflow.com/questions/552744/how-do-i-profile-memory-usage-in-python – sahasrara62 Dec 13 '21 at 11:06
  • @sahasrara62 sorry, that's not what I'm looking for. All the answers are about profiling tools that need to be integrated into the application or at least the application needs to be launched by the profiler. I'm looking for a way to inspect memory usage during runtime, without any previous integration of a profiler. – klamann Dec 13 '21 at 11:11
  • PyCharm has a profiler, but I don't know whether it supports attaching at runtime – sahasrara62 Dec 13 '21 at 11:25
  • Note that the Python you install pyrasite on doesn't need to be the same Python you inspect; you might still get it to work across Python versions – Maarten Derickx Dec 27 '21 at 23:35

1 Answer


https://github.com/vmware/chap (open source) does what you are asking for here, as long as you can run your application on Linux.

  1. Wait until the process has reached a point at which you are interested in what allocations it currently has.

  2. Gather a live core for your process (for example using gcore), but make sure the coredump filter is set first so that the relevant memory mappings end up in the dump, as in:

    echo 0x37 >/proc/pid-of-your-python-program/coredump_filter

    gcore pid-of-your-python-program

  3. Open the resulting core in chap.

  4. Do the following from the chap prompt:

    redirect on

    describe used

  5. Edit the resulting file or post-process it using the tool of your choice (a small sketch for this is shown after the sample output below).

The resulting files will have entries for each allocation currently in use. For the ones that correspond to python objects, they will generally show the python type, as in:

Anchored allocation at 7f5e7bf2a570 of size 40
This allocation matches pattern ContainerPythonObject.
This has a PyGC_Head at the start so the real PyObject is at offset 0x18.
This has reference count 1 and python type 0x7f5e824c08a0 (dict)

Anchored allocation at 7f5e7bf2a5b0 of size 40
This allocation matches pattern SimplePythonObject.
This has reference count 3 and python type 0x7f5e824cdfe0 (str)
This has a string of length 12 containing
"ETOOMANYREFS".

There are other commands you can use from chap to understand why any allocations of interest to you are anchored, because they let you traverse backwards along incoming references, but the above should suffice to figure out which kinds of allocations have high counts.

Suppose, for example, you wanted to understand how that dict in the allocation at 0x7f5e7bf2a570 is referenced. You could run this command:

chap> describe incoming 7f5e7bf2a570 /skipUnfavoredReferences true
Anchored allocation at 17fda90 of size 1a8
This allocation matches pattern PyDictKeysObject.

1 allocations use 0x1a8 (424) bytes.

You could in turn ask what references that PyDictKeysObject (not a python type, but used to store the keys for a dict):

chap> describe incoming 17fda90 /skipUnfavoredReferences true
Anchored allocation at 7f5e7c0e01f0 of size 40
This allocation matches pattern ContainerPythonObject.
The garbage collector considers this allocation to be reachable.
This has a PyGC_Head at the start so the real PyObject is at offset 0x18.
This has reference count 1 and python type 0x7f5e824c08a0 (dict)

1 allocations use 0x40 (64) bytes.
Tim Boddy
  • Not sure if I understand this correctly, but chap will give me a bunch of memory addresses, their allocation size, references, etc. But how can I figure out what Python objects map to these addresses? What would I do next when chap tells me that there is some data allocated at 0x42? – klamann Dec 14 '21 at 08:45
  • I'll modify the answer to add an example of part of the output of "describe used". – Tim Boddy Dec 14 '21 at 16:01
  • thanks for providing the example, I think I understand now what chap can do for me. However, I don't think I can find the root cause of my problem with this approach, since chap can only identify a few basic Python types, but not what class or which package they belong to. Knowing that there is a dict somewhere with a huge memory allocation is not very helpful, since in Python every object is basically a dict. – klamann Dec 14 '21 at 17:13
  • Usually one can tell how a dict is used by following incoming references back to something like a frame or a module. – Tim Boddy Dec 14 '21 at 18:05