I have a long-running CPython 3.8 process. After a while, it's using a huge amount of RAM. I have tried:

1. running `gc.collect()`
2. using pympler to discover all known Python objects:
```python
import gc
import psutil
from pympler import muppy, summary  # note: summary, not summarize

gc.collect()
total_ram = psutil.Process().memory_info().rss
all_objects = muppy.get_objects(include_frames=True)
s = summary.summarize(all_objects)
python_objects_size = sum(row[2] for row in s)
```
Output: 102 MiB of Python objects, but 824 MiB of RSS!
[EDIT] 3. using tracemalloc, which also reports ~100 MiB worth of Python objects
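For reference, a minimal sketch of the tracemalloc measurement (the placeholder workload is mine; in the real process tracing would wrap the long-running work):

```python
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation traceback

# ... placeholder for the long-running workload ...
data = [bytes(1000) for _ in range(1000)]

# Total memory currently traced, and the peak since start()
current, peak = tracemalloc.get_traced_memory()
print(f"traced: {current / 2**20:.1f} MiB (peak {peak / 2**20:.1f} MiB)")

# Top allocation sites, grouped by source line
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)
```

Like pympler, this only sees allocations that go through Python's allocator APIs, which is exactly why it agrees on the ~100 MiB figure while the RSS is much larger.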
[EDIT 2] Setting `PYTHONMALLOC=malloc` does not solve the problem either.
Is there a way to query the CPython memory manager to figure out:

- How much RAM it is holding, so that I can subtract it from the RSS and determine whether some C library that bypasses PyMem_Malloc is leaking
- Why it is holding the memory (e.g. find out that it's holding a whole 64 KiB page because of a single 20-byte PyObject that's still being referenced)
- Which C modules invoked PyMem_Malloc and never released the memory afterwards
- How to track OS-level malloc() and free() calls and cross-compare them with the ones performed by pymalloc, to figure out if there's a C library that's allocating memory without PyMem_Malloc
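On the first bullet, CPython does expose a CPython-specific, underscore-private hook: `sys._debugmallocstats()` dumps pymalloc's arena/pool/block statistics, including how many arenas are allocated and how many bytes they span. It writes to the C-level stderr, so capturing it in-process needs an fd-level redirect; a sketch (the `pymalloc_stats` helper name is mine):

```python
import os
import sys
import tempfile


def pymalloc_stats():
    """Capture sys._debugmallocstats(), which writes to the C-level stderr (fd 2)."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        saved_fd = os.dup(2)          # remember the real stderr
        try:
            os.dup2(tmp.fileno(), 2)  # point fd 2 at the temp file
            sys._debugmallocstats()   # C fprintf(stderr, ...) lands in tmp
        finally:
            os.dup2(saved_fd, 2)      # restore stderr
            os.close(saved_fd)
        tmp.seek(0)
        return tmp.read()


text = pymalloc_stats()
print(text[:500])
```

The "# arenas allocated" / "# bytes in allocated blocks" lines give pymalloc's own view of how much it holds versus how much of that is live, which at least bounds the fragmentation-vs-foreign-leak question. Note it says nothing about memory allocated outside pymalloc (or when `PYTHONMALLOC=malloc` is set).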
Related
- Calculating memory fragmentation in Python (2012; same question, but never answered)
- Does CPython's garbage collection do compaction? (answer: no)