I have a Python program which basically keeps a list of Counter objects and then writes them to disk. After four days the counting has finished, but the system has almost exhausted its 63 GB of memory, has already swapped out 50 GB, and is making no progress.
Here is a simplified version of my code.
import os
import time
from collections import Counter

print(os.getpid())
counters = [Counter() for i in range(4)]
while True:
    for i in range(1024):
        for counter in counters:
            counter[i] = 1
    time.sleep(5)
    with open('/tmp/counter.txt', 'w') as f:
        for counter in counters:
            f.write('\n'.join(map(str, counter.most_common())))
I am guessing it is stuck at the last line and cannot sort the dict because it is OOM.
I need to safely write these Counter objects to disk for later processing.
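For comparison, here is a minimal sketch of a write that avoids both the sort in most_common() and the single large string built by '\n'.join(...). It assumes the output order does not matter for later processing, and dump_counters_unsorted and the tab-separated layout are hypothetical choices, not part of my program. It would only help a fresh run; it does not recover the process that is already stuck.

```python
from collections import Counter


def dump_counters_unsorted(counters, path):
    """Write every (key, count) pair on its own line, without sorting."""
    with open(path, 'w') as f:
        for counter in counters:
            # items() walks the existing dict entries, so no sorted copy of
            # the data and no giant joined string is created in memory.
            for key, count in counter.items():
                f.write(f'{key}\t{count}\n')
```

Sorting could then be done later on the file itself, outside the counting process.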
In other threads I found some answers but couldn't work it out. Here is what I tried so far:
- Attach gdb to my Python program: gdb python3 32610
- Show backtrace: bt
- Try to pickle.dump a possible candidate for the Counter object:
(gdb) bt
...
#21 0x0000000000504c28 in PyEval_EvalFrameEx (throwflag=0, f=Frame 0x7f70a45d0cf8, for file /usr/lib/python3.6/collections/__init__.py, line 553, in most_common (self=<Counter at remote 0x7f70a2d7be60>, n=1048575)) at ../Python/ceval.c:4166
...
(gdb) python i = gdb.inferiors()[0]
(gdb) python m = i.read_memory(0x7f70a45d0cf8, 4)
(gdb) python print(m.tobytes())
b'\x02\x00\x00\x00'
(gdb) python import pickle
(gdb) python pickle.dump(m, open('/tmp/02.pickle', 'wb'))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: can't pickle memoryview objects
Error while executing Python code.
(gdb) dump value 0x7f70a45d0cf8
No value to dump.
I am not sure how to find the start/end address of the objects I am interested in.
- Counter is not always shown in the backtrace. How do I select the correct frame?
- Pickle won't pickle? I cannot install any of the suggested cPickle, Garlicsim, or meliae.
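The TypeError in the gdb session can be reproduced outside gdb. The sketch below uses a plain memoryview as a stand-in for the object returned by read_memory (the variable names are hypothetical). It shows that pickle rejects the memoryview itself but accepts a bytes copy of it. Note that this only stores the raw bytes read from that address; as far as I understand, it would not give back a usable Counter.

```python
import pickle

# Stand-in for the 4 bytes that read_memory returned in the gdb session.
raw = memoryview(b'\x02\x00\x00\x00')

try:
    pickle.dumps(raw)
    memoryview_pickled = True
except TypeError:
    # pickle does not support memoryview objects directly.
    memoryview_pickled = False

# Copying the buffer into an immutable bytes object makes it picklable.
payload = pickle.dumps(raw.tobytes())
restored = pickle.loads(payload)
```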