Garbage collector tries to collect shared memory object

Question

I've got two Python scripts that both should do essentially the same thing: grab a large object in memory, then fork a bunch of children. The first script uses bare os.fork:

import time
import signal
import os
import gc

gc.set_debug(gc.DEBUG_STATS)


class GracefulExit(Exception):
    pass


def child(i):
    def exit(sig, frame):
        raise GracefulExit("{} out".format(i))

    signal.signal(signal.SIGTERM, exit)
    while True:
        time.sleep(1)


if __name__ == '__main__':
    workers = []

    d = {}
    for i in xrange(30000000):
        d[i] = i

    for i in range(5):
        pid = os.fork()
        if pid == 0:
            child(i)
        else:
            print pid
            workers.append(pid)

    while True:
        wpid, status = os.waitpid(-1, os.WNOHANG)
        if wpid:
            print wpid, status
        time.sleep(1)

The second script uses multiprocessing module. I'm running both on Linux (Ubuntu 14.04), so it should use os.fork under the hood too, as documentation states:

import multiprocessing
import time
import signal
import gc

gc.set_debug(gc.DEBUG_STATS)


class GracefulExit(Exception):
    pass


def child(i):
    def exit(sig, frame):
        raise GracefulExit("{} out".format(i))

    signal.signal(signal.SIGTERM, exit)
    while True:
        time.sleep(1)


if __name__ == '__main__':
    workers = []

    d = {}
    for i in xrange(30000000):
        d[i] = i

    for i in range(5):
        p = multiprocessing.Process(target=child, args=(i,))
        p.start()
        print p.pid
        workers.append(p)

    while True:
        for worker in workers:
            if not worker.is_alive():
                worker.join()
        time.sleep(1)

The difference between those two scripts is the following: when I kill a child (sending a SIGTERM), bare-fork script tries to garbagecollect the shared dictionary, despite the fact that it is still referenced by parent process and isn't actually copied into child's memory (because of copy-on-write)

kill <pid>

Traceback (most recent call last):
  File "test_mp_fork.py", line 33, in <module>
    child(i)
  File "test_mp_fork.py", line 19, in child
    time.sleep(1)
  File "test_mp_fork.py", line 15, in exit
    raise GracefulExit("{} out".format(i))
__main__.GracefulExit: 3 out
gc: collecting generation 2...
gc: objects in each generation: 521 3156 0
gc: done, 0.0024s elapsed.

(perf record -e page-faults -g -p <pid> output:)

+  99,64%  python  python2.7           [.] PyInt_ClearFreeList
+   0,15%  python  libc-2.19.so        [.] vfprintf
+   0,09%  python  python2.7           [.] 0x0000000000144e90
+   0,06%  python  libc-2.19.so        [.] strlen
+   0,05%  python  python2.7           [.] PyArg_ParseTupleAndKeywords
+   0,00%  python  python2.7           [.] PyEval_EvalFrameEx
+   0,00%  python  python2.7           [.] Py_AddPendingCall
+   0,00%  python  libpthread-2.19.so  [.] sem_trywait
+   0,00%  python  libpthread-2.19.so  [.] __errno_location

While multiprocessing-based script does no such thing:

kill <pid>

Process Process-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "test_mp.py", line 19, in child
    time.sleep(1)
  File "test_mp.py", line 15, in exit
    raise GracefulExit("{} out".format(i))
GracefulExit: 2 out

(perf record -e page-faults -g -p <pid> output:)

+  62,96%  python  python2.7           [.] 0x0000000000047a5b
+  32,28%  python  python2.7           [.] PyString_Format
+   2,65%  python  python2.7           [.] Py_BuildValue
+   1,06%  python  python2.7           [.] PyEval_GetFrame
+   0,53%  python  python2.7           [.] Py_AddPendingCall
+   0,53%  python  libpthread-2.19.so  [.] sem_trywait

I can also force the same behavior on multiprocessing-based script by explicitly calling gc.collect() before raising GracefulExit. Curiously enough, the reverse is not true: calling gc.disable(); gc.set_threshold(0) in bare-fork script doesn't help to get rid of PyInt_ClearFreeList calls.

To the actual questions:

Why is this happening? I sort of understand why python would like to free all the allocated memory on process exit, ignoring the fact that the child process doesn't physically own it, but how come multiprocessing module doesn't do the same?
I'd like to achieve second-script-like behavior (i.e.: not trying to free the memory which has been allocated by a parent process) with bare-fork solution (mainly because I use a third-party process manager library which doesn't use multiprocessing); how could I possibly do that?

score 0 · Answer 1 · answered Mar 29 '17 at 12:14

0

Couple things

In python, multiple python processes means multiple interpreters with their own GIL, GC et al
The d dictionary is not passed in as an argument to the process, it is a globally shared variable.

The reason it gets collected is because each process thinks its the only one holding a reference to it which, strictly speaking, is true as it's a single globally shared object reference to the dictionary.

When Python GC checks it, it checks the ref counter for that object. Since there is only the one shared reference, removing that would mean ref count == 0, so it gets collected.

To resolve the issue, d should be passed into each forked process, making each process hold its own reference to it.

answered Mar 29 '17 at 12:14

danny

5,140
1
19
31

This doesn't explain multiprocessing-based script *not* performing garbage collection, though. – rocknrollnerd Mar 29 '17 at 12:40
Yes, it does. `d` is a shared variable in the forked process by `os.fork` while it does not exist in the process spawned by `multiprocessing` and therefore cannot be collected. – danny Mar 29 '17 at 12:47
This isn't true, because multiprocessing performs `os.fork` on Unix. You can add a line to the child's code, just after `time.sleep(1)`, something like `print d[0]` and see for yourself that `d` is accessible. – rocknrollnerd Mar 29 '17 at 12:52
Not quite. You can see exactly what `multiprocessing` is doing in `/multiprocessing/forking.py`. It does use `os.fork`, however it does more things before it forks. See `prepare` function. The reference to `d` in the process spawned by multiprocessing is not the same reference to `d` in the main process, unlike the bare fork. – danny Mar 29 '17 at 14:16
I've added a debug print in `prepare` and it isn't even being called. `Process.start` calls `multiprocessing.forking.Popen.__init__`, which does the forking, then calls `process_instance._bootstrap`. There are a couple of extra calls here, bit I cannot see anything related to "the reference to `d` is not the same" you mention, can you point it out for me? Also, if I add `print id(d)` in both parent and children, it prints the same id. Doesn't it mean the reference is the same? – rocknrollnerd Mar 29 '17 at 14:41
I guess I found it: it's `os._exit` call which multiprocessing does in `forking.Popen`. If I replace it with `sys.exit`, it starts behaving the way first script does. – rocknrollnerd Mar 29 '17 at 15:05

score 0 · Accepted Answer · edited May 23 '17 at 12:17

0

Multiprocessing behaves differently because it uses os._exit which doesn't call exit handler, which, apparently, involves garbage collection (more on the topic). Explicitly calling os._exit in bare-fork version of the script achieves the same result.

edited May 23 '17 at 12:17

Community

1
1

answered Mar 29 '17 at 15:42

rocknrollnerd

2,564
1
16
20

Garbage collector tries to collect shared memory object

2 Answers2