
I was testing the GC behavior that Python performs after starting a process with multiprocessing:

from multiprocessing import Process
import time

class A(object):
    def __del__(self):
        print 'deleting'

def f(name):
    import gc
    gc.collect()
    print 'hello', name
    print [map(lambda s: str(s)[:64], gc.get_referrers(o)) for o in gc.get_objects() if isinstance(o, A)]
    time.sleep(123)

def main():
    a=A()
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()


if __name__ == '__main__':
    try:
        main()
    except:
        print 'sdfsdf!'

Output:

hello bob
[["[[], {'__setattr__': <slot wrapper '__setattr__' of 'object' obj", '<frame object at 0xb87570>', '<frame object at 0xbd7f80>']]

I want to close a file descriptor by having __del__ run. When the subprocess starts and enters the f function, the A instance a should no longer be reachable. But __del__ is not executed, which means the a object is still not freed. The output shows that it is apparently being held by frame objects.
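The frame-holding behavior can be reproduced without multiprocessing at all. This is my own minimal sketch (the names outer/inner are made up), showing that a caller's frame keeps its locals alive for as long as the frame itself exists:

```python
import gc

class A(object):
    def __del__(self):
        print('deleting')

def inner():
    # outer()'s frame is still alive on the call stack here, so its
    # local 'a' remains reachable even though inner() never touches it.
    return len([o for o in gc.get_objects() if isinstance(o, A)])

def outer():
    a = A()
    return inner()

print(outer())  # the instance is only collected once outer()'s frame dies
```

In the forked child the situation is analogous: the parent's frames were copied into the child's memory, and they still reference a.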

So I tried another approach: raising an exception to unwind the stack, hoping to free the unreachable object and run its __del__ method:

from multiprocessing import Process
import time
import sys

class GcHelp(Exception):
    def __init__(self, func):
        self.func = func
        super(GcHelp, self).__init__(func.__name__)

class A(object):
    def __del__(self):
        print 'deleting'

def f():
    print 'target function'


def raiser():
    raise GcHelp(f)

def main():
    a=A()
    p = Process(target=raiser, args=())
    p.start()
    p.join()


if __name__ == '__main__':
    try:
        main()
    except GcHelp as e:
        sys.exc_clear()
        e.func()
    except:
        print 'sdfsdf!' 

Output:

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "gc1.py", line 19, in raiser
    raise GcHelp(f)
GcHelp: f

It seems that multiprocessing has already cleared the stack and taken over all exception handling, and the parent frame no longer exists. But then why is the frame still there in the first code example? It is clearly still holding a, and the object is not freed at all.

Is there some way to perform this kind of GC in Python?

Thanks a lot.

user2828102
    The `a` object is still accessible, after the function returns. Protip: Don't [*ever*](http://stackoverflow.com/questions/28300946/raii-in-python-whats-the-point-of-del/28302755#28302755) rely on the garbage collector. Just use `with` to close the file properly. – Kevin Apr 09 '16 at 04:30
  • @Kevin But in the subprocess, when the function returns, the process ends. So `a` is not reachable in the subprocess for its whole life, is it? I cannot use `with` because the fd will be used everywhere, not only in one code block. – user2828102 Apr 09 '16 at 04:37
  • There is no `a` in the subprocess. – Kevin Apr 09 '16 at 05:33
  • @Kevin See the first code example. In the subprocess the code does find the `a` through `gc.get_objects()` – user2828102 Apr 09 '16 at 05:38
  • @Kevin There is an `a` in the child... at least on unixy systems. `fork` creates a copy-on-write view of the parent memory space, and all of the parent's data at the time of the fork is there. If there is a later `execv` to run another program, it is destroyed then, but unix/linux/ios multiprocessing just does the fork. It's a different deal on Windows, which doesn't have the concept of `fork`. – tdelaney Apr 09 '16 at 05:46
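As Kevin's comment suggests, deterministic cleanup with a context manager avoids relying on the garbage collector entirely. A minimal sketch (the file name is made up):

```python
# The file is closed as soon as the block exits, even on an exception;
# no __del__ or GC involvement is needed.
with open('deleteme.txt', 'w') as fh:
    fh.write('hello')

print(fh.closed)  # the handle is already closed here
```

If the handle has to outlive a single block, an explicit try/finally wrapped around the object's whole lifetime gives the same determinism without depending on garbage collection.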

1 Answer


Why would you want to close the file? Assuming this is a linuxy system, a forked environment is, well, forking weird. If you close the file in the child, it will flush any data still in the buffers... but that same data will be flushed again in the parent, resulting in duplicated data.

import multiprocessing as mp

fd = None

def worker():
    # child closes file, flushing "1"
    fd.close()

def doit():
    global fd
    fd = open('deleteme', 'w')
    fd.write('1')
    p = mp.Process(target=worker)
    p.start()
    p.join()
    # parent closes file, flushing "1"
    fd.close()
    # lets see what we got
    print(open('deleteme').read())

doit()

This script prints 11 because the child and parent file objects both wrote 1. It gets far crazier if either side calls flush or seek.

"it enters the f function and the A instance a would no longer be reachable." That's not true in general. First, the only reason the child's worker function exits the process when it returns is that the multiprocessing module called it from a function that ensures exit. In the general case, a forked function can return and continue executing its parent's code. So, from Python's perspective, that a is still reachable. Further, your worker could call a function that itself touches a. There is no way for Python to know that.

It would be incredibly expensive for python to clean up all "unreferencable" objects after a fork. And incredibly dangerous as those objects can change the system in many ways.
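If the real goal is only to release the descriptor in the child without the double flush shown above, one option (my own assumption, Linux-only, not something this answer prescribes) is to close the OS-level fd directly and skip interpreter cleanup, leaving the parent's buffer untouched:

```python
import os

f = open('deleteme2', 'w')
f.write('1')                  # still sitting in the userspace buffer
pid = os.fork()
if pid == 0:
    # Child: release the kernel descriptor without flushing the
    # Python-level buffer (the child's copy of the buffer is discarded).
    os.close(f.fileno())
    os._exit(0)               # no interpreter shutdown, so no second flush
os.waitpid(pid, 0)
f.close()                     # parent flushes '1' exactly once
print(open('deleteme2').read())
```

The child's close only affects its own descriptor table entry, so the parent's handle keeps working and the data is written once.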

tdelaney
  • Thanks. The buffer thing is indeed a problem that I totally missed. So, on the other hand, there is no way to release such useless memory in Python, right? And the multiprocessing module cleans the stack for me, so I cannot catch the GcHelp exception any more. – user2828102 Apr 09 '16 at 05:54
  • This is a linux process thing, not necessarily python in particular. The forked process has a copy-on-write view of the parent's memory. This is fast to setup because no memory is copied unless the child writes to it. There isn't any useless memory to clean up because it is quite literally the parent's memory. As for catching exceptions in the forked process... you'd need to add your own try/except and a queue to pass that info back to the parent. – tdelaney Apr 09 '16 at 06:00
  • Yeah, the exception is not for message passing, it's for cleaning the call stack. Some programs, like Android, use this method for that purpose. Maybe I should use `os.fork` directly to do this. – user2828102 Apr 09 '16 at 06:06
  • The exception doesn't clean any more of the call stack than returning from the worker does. The `multiprocessing` executor that calls the function also catches the exception. I'm not sure what your goals are but the process worker should not try to clean up anything it didn't create. – tdelaney Apr 09 '16 at 06:36
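tdelaney's try/except-plus-queue suggestion from the comments might look like the following sketch (worker/run are hypothetical names; a fork-based start method is assumed):

```python
import multiprocessing as mp
import traceback

def worker(q):
    try:
        raise ValueError('boom')          # stand-in for the real work
    except Exception:
        q.put(traceback.format_exc())     # report the failure to the parent

def run():
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    report = q.get()   # read before join so a full pipe can't deadlock
    p.join()
    return report

print(run())
```

The parent sees the full child traceback as a string, instead of multiprocessing swallowing the exception as in the GcHelp example above.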