
I would like to renew my question about multiprocessing inside a decorator (my previous question seems dead :)). I stumbled on this problem and unfortunately have no idea how to solve it. My application needs to use multiprocessing inside a decorator, but when I do I get the error: Can't pickle <function run_testcase at 0x00000000027789C8>: it's not found as __main__.run_testcase. On the other hand, when I call my multiprocessing function like a normal function, wrapper(function,*args), it works. This is very tricky and I have no idea what I am doing wrong. I am close to concluding that this is a Python bug :). Maybe someone knows a workaround that keeps the same syntax. I run this code on Windows (unluckily).
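For context: on Windows multiprocessing has no fork(), so the Process object (including its target) is pickled and sent to a freshly started interpreter, and any target that cannot be found under its module-qualified name fails with exactly this kind of error. A small sketch of the rule (function names here are invented for illustration):

```python
import pickle

def importable_target(msg):
    # Reachable as <module>.importable_target, so it pickles by name.
    print(msg)

def make_local_target():
    def local_target():        # nested: has no module-level name
        print("hi")
    return local_target

# The importable function serializes fine...
assert pickle.dumps(importable_target)

# ...while the nested one fails, much like the decorated method here.
try:
    pickle.dumps(make_local_target())
except Exception as exc:
    print("pickling failed:", exc)
```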

The previous question: Using multiprocessing inside decorator generates error: can't pickle function...it's not found as

The simplest code to simulate this error:

z_helper.py:

from multiprocessing import Process,Event

class ExtProcess(Process):
    def __init__(self, event,*args,**kwargs):
        self.event=event
        Process.__init__(self,*args,**kwargs)

    def run(self):
        Process.run(self)
        self.event.set()

class PythonHelper(object):

    @staticmethod
    def run_in_parallel(*functions):
        event=Event()
        processes=dict()
        for function in functions:
            fname=function[0]
            try:
                fargs = function[1]
            except IndexError:
                fargs = list()
            try:
                fproc = function[2]
            except IndexError:
                fproc = 1
            for i in range(fproc):
                process=ExtProcess(event,target=fname,args=fargs)
                process.start()
                processes[process.pid]=process
        event.wait()
        for process in processes.values():
            process.terminate()
        for process in processes.values():
            process.join()
z_recorder.py:

class Recorder(object):
    def capture(self):
        while True:print("recording")
z_wrapper.py:

from z_helper import PythonHelper
from z_recorder import Recorder

def wrapper(fname,*args):
    try:
        PythonHelper.run_in_parallel([fname,args],[Recorder().capture])
        print("success")
    except Exception as e:
        print("failure: {}".format(e))
z_report.py:

from z_wrapper import wrapper
from functools import wraps

class Report(object):
    @staticmethod
    def debug(fname):
        @wraps(fname)
        def function(*args):
            wrapper(fname,args)
        return function

executing:

from z_report import Report
import time

class Test(object):
    @Report.debug
    def print_x(self,x):
        for index,data in enumerate(range(x)):
            print(index,data); time.sleep(1)

if __name__=="__main__":
    Test().print_x(10)

I added @wraps to the previous version

My Traceback:

Traceback (most recent call last):
  File "C:\Interpreters\Python32\lib\pickle.py", line 679, in save_global
    klass = getattr(mod, name)
AttributeError: 'module' object has no attribute 'run_testcase'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\EskyTests\w_Logger.py", line 19, in <module>
    logger.run_logger()
  File "C:\EskyTests\w_Logger.py", line 14, in run_logger
    self.run_testcase()
  File "C:\EskyTests\w_Decorators.py", line 14, in wrapper
    PythonHelper.run_in_parallel([function,args],[recorder.capture])
  File "C:\EskyTests\w_PythonHelper.py", line 25, in run_in_parallel
    process.start()
  File "C:\Interpreters\Python32\lib\multiprocessing\process.py", line 130, in start
    self._popen = Popen(self)
  File "C:\Interpreters\Python32\lib\multiprocessing\forking.py", line 267, in __init__
    dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "C:\Interpreters\Python32\lib\multiprocessing\forking.py", line 190, in dump
    ForkingPickler(file, protocol).dump(obj)
  File "C:\Interpreters\Python32\lib\pickle.py", line 237, in dump
    self.save(obj)
  File "C:\Interpreters\Python32\lib\pickle.py", line 344, in save
    self.save_reduce(obj=obj, *rv)
  File "C:\Interpreters\Python32\lib\pickle.py", line 432, in save_reduce
    save(state)
  File "C:\Interpreters\Python32\lib\pickle.py", line 299, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Interpreters\Python32\lib\pickle.py", line 623, in save_dict
    self._batch_setitems(obj.items())
  File "C:\Interpreters\Python32\lib\pickle.py", line 656, in _batch_setitems
    save(v)
  File "C:\Interpreters\Python32\lib\pickle.py", line 299, in save
    f(self, obj) # Call unbound method with explicit self
  File "C:\Interpreters\Python32\lib\pickle.py", line 683, in save_global
    (obj, module, name))
_pickle.PicklingError: Can't pickle <function run_testcase at 0x00000000027725C8>: it's not found as __main__.run_testcase
falek.marcin

1 Answer


The multiprocessing module "invokes" functions in its slave processes by calling the pickler on them. This is because it has to send the name of the function through the IPC interfaces it creates to the slave processes. The pickler figures out the proper name to use and sends it through, and then on the other side the unpickler transforms the name back into the function.
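This name-based pickling is easy to see with plain pickle, no multiprocessing needed (a minimal sketch; the function name is made up):

```python
import pickle

def top_level():
    return "ok"

# A module-level function pickles as a reference (its module and name),
# not as code; unpickling looks that name up again.
data = pickle.dumps(top_level)
restored = pickle.loads(data)
print(restored is top_level)   # True: the very same function object
```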

When a function is a class member, it can't be pickled properly without help. It's worse for @staticmethod members, because they have type function rather than type instancemethod, which fools the pickler. You can see this pretty easily without using multiprocessing:

import pickle

class Klass(object):
    @staticmethod
    def func():
        print('func()')
    def __init__(self):
        print('Klass()')

obj = Klass()
obj.func()
print(pickle.dumps(obj.func))

produces:

Klass()
func()
Traceback (most recent call last):
 ...
pickle.PicklingError: Can't pickle <function func at 0x8017e17d0>: it's not found as __main__.func

The problem is clearer when you try to pickle a regular, non-static-method like obj.__init__, as the pickler then realizes that it is indeed an instance-method:

TypeError: can't pickle instancemethod objects

All is not lost, however. You just need to add a level of indirection. You can provide an ordinary function that creates the instance binding in the target process, sending it at least two arguments: the (pickle-able) class instance and the name of the function. I also add any arguments to use when calling the function for completeness. You then invoke this ordinary function in the target process, and it invokes the class's member function:

def call_name(instance, name, *args, **kwargs):
    """Helper for multiprocessing: call getattr(instance, name)(*args, **kwargs)."""
    return getattr(instance, name)(*args, **kwargs)

Now instead of (this is copied from your linked post):

PythonHelper.run_in_parallel([self.run_testcase],[recorder.capture])

you would do something like this (you may want to fuss around with the call sequence):

PythonHelper.run_in_parallel([call_name, (self, 'run_testcase')],
    [recorder.capture])

(note: this is all untested and may have various errors).


Update

I took the new code you posted and tried it out.

First I had to fix indentation in z_report.py (de-indent all of class Report).

Once that was done, running it gave a rather different error than the one you show:

Process ExtProcess-1:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/tmp/t/marcin/z_helper.py", line 9, in run
    Process.run(self)
  File "/usr/local/lib/python2.7/multiprocessing/process.py", line 114, in run
recording
[infinite spew of "recording" messages]

To fix the endless "recording" messages:

diff --git a/z_recorder.py b/z_recorder.py
index 6163a87..a482268 100644
--- a/z_recorder.py
+++ b/z_recorder.py
@@ -1,4 +1,6 @@
+import time
 class Recorder(object):
     def capture(self):
-        while True:print("recording")
-
+        while True:
+            print("recording")
+            time.sleep(5)

That left the one remaining problem: wrong arguments to print_x:

TypeError: print_x() takes exactly 2 arguments (1 given)

Python was actually doing all the right stuff for you at this point, it's just that z_wrapper.wrapper is a bit overzealous:

diff --git a/z_wrapper.py b/z_wrapper.py
index a0c32bf..abb1299 100644
--- a/z_wrapper.py
+++ b/z_wrapper.py
@@ -1,7 +1,7 @@
 from z_helper import PythonHelper
 from z_recorder import Recorder

-def wrapper(fname,*args):
+def wrapper(fname,args):
     try:
         PythonHelper.run_in_parallel([fname,args],[Recorder().capture])
         print("success")

The problem here is that by the time you get to z_wrapper.wrapper, the function arguments have been all bundled up into a tuple. z_report.Report.debug already has:

    def function(*args):

so that the two arguments, in this case the instance of __main__.Test and the value 10, have been bundled into a single tuple. You just want z_wrapper.wrapper to pass that (single) tuple to PythonHelper.run_in_parallel to supply the arguments. If you add another *args, that tuple gets wrapped in yet another tuple (of one element this time). (You can see this by adding a print("args:", args) in z_wrapper.wrapper.)
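The re-wrapping effect can be demonstrated in a few lines (these function names are hypothetical stand-ins for the real ones):

```python
def collect(*args):            # like z_report's function(*args): bundles into a tuple
    return args

def rewrap(fname, *args):      # like the original wrapper: wraps the tuple again
    return args

def passthrough(fname, args):  # like the fixed wrapper: hands the tuple on unchanged
    return args

bundled = collect("instance", 10)
print(rewrap("target", bundled))       # (('instance', 10),)  -- doubly wrapped
print(passthrough("target", bundled))  # ('instance', 10)
```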

torek
  • I can't add a self argument inside (). In my example `PythonHelper.run_in_parallel([self.run_testcase],[recorder.capture])` works correctly, but using `PythonHelper.run_in_parallel([function,args],[recorder.capture])` inside the decorator generates the error. Maybe I don't understand your example, but I can't use this in my code. – falek.marcin Apr 29 '12 at 09:57
  • I added an example in which I try to check the fixes and come up with something new – falek.marcin Apr 29 '12 at 10:54
  • This issue affects only this one function (multiprocessing). I should add that other static methods are working properly. – falek.marcin Apr 29 '12 at 13:19
  • thanks for your answer and for fixing my code, but it still doesn't solve the pickle error when I'm using multiprocessing inside a decorator. – falek.marcin Apr 30 '12 at 17:23
  • I'll need something closer to the version-that-doesn't-work to fix it. :-) There is a way to get the pickler to pickle instance-methods; it's not perfect but it might suffice for your needs. I can't tell without code, though. – torek May 01 '12 at 03:14
  • the source code that generates this error is from the previous task: http://stackoverflow.com/questions/8126654/using-multiprocessing-inside-decorator-generates-error-cant-pickle-function, I made some changes to it, so you can move it to the tests (all files are signed). The 'w_Logger' file makes a call using the decorator inside and this is the main problem with pickle (@check). The only external library used in this code is PIL (you can easily put anything else in the 'capture' method — what matters is that it works in parallel). – falek.marcin May 01 '12 at 05:53
  • To narrow down the problem I created the simpler code that I added to this task (the print_x method), but I was not able to reach any meaningful conclusions. The original code is the code from the previous task. If there is a way to get the pickler to handle this I would be thankful for any hint :) I run this code on Windows XP/7 with Python 3.2 – falek.marcin May 01 '12 at 05:53
  • Hm, I don't have Windows at all, and I'd have to build and install Python3. Might be a while before I can get back to this... – torek May 01 '12 at 07:01
  • anyone can help with this problem? – falek.marcin May 06 '12 at 17:21