
I have a Python program that keeps a list of Counter objects and then writes them to disk. After four days the counting has finished, but the system is almost out of its 63 GB of memory: it has already swapped out 50 GB and is making no progress.

Here is a simplified version of my code:

import os
import time
from collections import Counter

print(os.getpid())

counters = [Counter() for i in range(4)]

while True:
    for i in range(1024):
        for counter in counters:
            counter[i] = 1
    time.sleep(5)
    with open('/tmp/counter.txt', 'w') as f:
        for counter in counters:
            f.write('\n'.join(map(str, counter.most_common())))
            f.write('\n')  # separate one counter's output from the next

I am guessing it is stuck on the last line: most_common() has to sort the whole dict, and there is no memory left for that.

I need to safely write these Counter objects to disk for later processing.
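In hindsight, a periodic pickle checkpoint during counting would have avoided this; here is a sketch (the checkpoint path is made up, and this is not my actual code):

```python
import pickle
from collections import Counter

counters = [Counter() for i in range(4)]

# ... counting happens here ...

# Periodically checkpoint the counters; pickle streams the dict items
# without building the fully sorted list that most_common() needs.
with open('/tmp/counters.pickle', 'wb') as f:  # hypothetical path
    pickle.dump(counters, f)

# Later, load them back for processing.
with open('/tmp/counters.pickle', 'rb') as f:
    counters = pickle.load(f)
```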

In other threads I found some answers, but I couldn't get them to work. Here is what I tried so far:

  1. Attach gdb to my Python program: gdb python3 32610
  2. Show backtrace: bt
  3. Try to pickle.dump possible candidate for Counter object

    (gdb) bt
    ...
    #21 0x0000000000504c28 in PyEval_EvalFrameEx (throwflag=0,
    f=Frame 0x7f70a45d0cf8, for file /usr/lib/python3.6/collections/__init__.py, line 553, in most_common (self=<Counter at remote 0x7f70a2d7be60>, n=1048575)) at ../Python/ceval.c:4166
    ...
    (gdb) python i = gdb.inferiors()[0]
    (gdb) python m = i.read_memory(0x7f70a45d0cf8, 4)
    (gdb) python print(m.tobytes())
    b'\x02\x00\x00\x00'
    (gdb) python import pickle
    (gdb) python pickle.dump(m, open('/tmp/02.pickle', 'wb'))
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    TypeError: can't pickle memoryview objects
    Error while executing Python code.
    (gdb) dump value 0x7f70a45d0cf8
    No value to dump.
    
  4. I am not sure how to find the start/end addresses of the objects I am interested in.

  5. Counter is not always shown in the backtrace. How do I select the correct frame?
  6. Pickle won't pickle? I cannot install any of the suggested cPickle, Garlicsim, or meliae packages.
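For what it's worth, the TypeError in step 3 can be avoided by copying the memoryview into a bytes object before pickling, though that only saves the raw bytes at that address, not a reconstructed Counter. A minimal sketch:

```python
import pickle

# gdb's Inferior.read_memory() returns a memoryview, which pickle
# refuses to serialize; bytes() makes a picklable copy of the data.
m = memoryview(b'\x02\x00\x00\x00')
data = pickle.dumps(bytes(m))
assert pickle.loads(data) == b'\x02\x00\x00\x00'
```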
DYZ
user1480788
  • What exactly are you trying to use gdb to do and why? Any idea why you're running out of memory? The code in your question doesn't look like it would keep needing more and more. – martineau Feb 12 '19 at 07:55
  • gdb is a suggested solution I am trying to use to dump the objects to a file. I am open to suggestions. This code is a simplified version; in the real code the Counters hold millions of keys. Regardless, I need to save those keys and count values to a file. – user1480788 Feb 12 '19 at 09:11
  • Do you want to dump the objects in the current process, so that the previous computation is not wasted? – Florian Weimer Feb 12 '19 at 09:39
  • Yes, I want to dump the objects in the currently running process, so four days of work is not wasted. – user1480788 Feb 12 '19 at 10:27

1 Answer


I kind of found a solution.

zcat /usr/share/doc/python3.6/gdbinit.gz > ~/.gdbinit

This gdbinit file defines a macro called pyg, which prints the repr of a Python object to stderr.

pylocals, defined in the same file, calls pyg for every local variable found in the selected frame.

# gdb -p 12912
(gdb) bt
(gdb) f 21 # select the relevant frame
(gdb) pylocals

This prints something like the following to the Python program's stderr:

object  : [Counter({'foobar': 321}), Counter(), Counter(), Counter()]
type    : list
refcount: 1
address : 0x7f02578ea5c8
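Assuming the dumped lines are plain Counter reprs like the one above, they can be turned back into objects for later processing; a sketch using ast.literal_eval on the inner dict literal (the helper name is mine):

```python
import ast
import re
from collections import Counter

def counter_from_repr(text):
    # "Counter({'foobar': 321})" -> parse the inner dict literal safely;
    # a bare "Counter()" has no dict and yields an empty Counter.
    match = re.fullmatch(r"Counter\((\{.*\})\)", text.strip())
    if match:
        return Counter(ast.literal_eval(match.group(1)))
    return Counter()

print(counter_from_repr("Counter({'foobar': 321})"))  # Counter({'foobar': 321})
```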

For my real code, since the objects I am trying to dump are huge, I suspended the Python program and redirected its output to a new file:

fg &> gdb.out.pylocals.txt

I imagine it will take some time.

In the meantime, I hope this helps someone else too.

user1480788