I'm working on a project where I temporarily store objects in lists. But it seems that even after all references to these objects are deleted, the memory isn't freed, which leads to very high RAM usage in my process.
I tried to reproduce the problem with a simple example, and the results are indeed very strange; I can't explain them.
Here's a simple bench.py:
import psutil
import os
import gc
import sys
import numpy as np

mod = int(sys.argv[1])
process = psutil.Process(os.getpid())
print(f"Empty memory: {process.memory_info().rss/1e6:.1f}MB")

# Keep a reference to every mod-th array; the rest become garbage right away.
arrs = list()
for i in range(1000):
    arr = np.arange(50000) * i
    if i % mod == 0:
        arrs.append(arr)

print(f"Memory before cleaning: {process.memory_info().rss/1e6:.1f}MB - {len(arrs)} elements in array")

# Drop the only remaining references and force a collection.
del arrs
gc.collect()
print(f"Memory after cleaning: {process.memory_info().rss/1e6:.1f}MB")
If I run it with the arguments 1, 2, 3, and 500, I get these results:
# 1
Empty memory: 33.3MB
Memory before cleaning: 435.4MB - 1000 elements in array
Memory after cleaning: 34.4MB
# 2
Empty memory: 33.3MB
Memory before cleaning: 234.2MB - 500 elements in array
Memory after cleaning: 34.5MB
# 3 - Why is nothing cleaned in this case?
Empty memory: 33.3MB
Memory before cleaning: 167.8MB - 334 elements in array
Memory after cleaning: 167.8MB
# 500
Empty memory: 33.3MB
Memory before cleaning: 35.4MB - 2 elements in array
Memory after cleaning: 34.4MB
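The "before cleaning" numbers themselves make sense to me: each array holds 50,000 integers, i.e. 400 kB with numpy's default int64 dtype on my machine, so with mod=1 I expect 1000 * 0.4 MB = 400 MB on top of the ~33 MB baseline, which matches the 435.4 MB reported. A quick check:

import numpy as np
print(np.arange(50000).nbytes)  # 50,000 values * 8 bytes (int64) -> 400000

It's the "after cleaning" numbers I can't explain.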
It doesn't make any sense to me: why isn't the memory cleaned the same way in the three cases? Why does the run that keeps the most arrays get the most effective cleanup?
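One thing I wondered about: maybe the arrays really are freed at the Python level and it's the allocator that isn't returning the pages to the OS. As a rough test (this assumes Linux with glibc, since malloc_trim is glibc-specific), I thought about adding this after the gc.collect() call:

import ctypes

# Ask glibc's malloc to release free memory at the top of the heap back to the OS.
libc = ctypes.CDLL("libc.so.6")
libc.malloc_trim(0)

Would that be a valid way to tell "leaked" memory apart from memory the allocator is merely holding on to, or am I misreading what rss reports?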