
I do not know why, but I get this strange error whenever I try to pass a shared custom-class object to a method of another shared object. Python version: 3.6.3

Code:

from multiprocessing.managers import SyncManager

class MyManager(SyncManager): pass
class MyClass: pass

class Wrapper:
    def set(self, ent):
        self.ent = ent

MyManager.register('MyClass', MyClass)
MyManager.register('Wrapper', Wrapper)

if __name__ == '__main__':
    manager = MyManager()
    manager.start()

    try:
        obj = manager.MyClass()
        lst = manager.list([1,2,3])

        collection = manager.Wrapper()
        collection.set(lst) # executed fine
        collection.set(obj) # raises error
    except Exception as e:
        raise

Error:

---------------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\Program Files\Python363\lib\multiprocessing\managers.py", line 228, in serve_client
    request = recv()
  File "D:\Program Files\Python363\lib\multiprocessing\connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
  File "D:\Program Files\Python363\lib\multiprocessing\managers.py", line 881, in RebuildProxy
    return func(token, serializer, incref=incref, **kwds)
TypeError: AutoProxy() got an unexpected keyword argument 'manager_owned'
---------------------------------------------------------------------------

What's the problem here?

Sergey Dylda

4 Answers


I ran into this too. As noted, this is a bug in Python multiprocessing (see issue #30256), and the pull request that corrects it has not yet been merged. That pull request has since been superseded by another PR that makes the same change but also adds a test.

Apart from manually patching your local installation, you have three other options:

  • you could use the MakeProxyType() callable to specify your proxytype, without relying on the AutoProxy proxy generator,
  • you could define a custom proxy class,
  • you could patch the bug with a monkeypatch

I'll describe those options below, after explaining what AutoProxy does:

What's the point of the AutoProxy class

The multiprocessing Manager pattern gives access to shared values by putting the values all in the same, dedicated 'canonical values server' process. All other processes (clients) talk to the server through proxies that then pass messages back and forth with the server.

The server does need to know what methods are acceptable for the type of object, however, so clients can produce a proxy object with the same methods. This is what the AutoProxy object is for. Whenever a client needs a new instance of your registered class, the default proxy the client creates is an AutoProxy, which then asks the server to tell it what methods it can use.

Once it has the method names, it calls MakeProxyType to construct a new class and then creates an instance of that class to return.

All this is deferred until you actually need an instance of the proxied type, so in principle AutoProxy saves a little bit of memory if you are not using certain classes you have registered. It's very little memory, however, and the downside is that this process has to take place in each client process.
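In sketch form, AutoProxy does roughly the following. This is a simplified paraphrase of the stdlib function, not a verbatim copy; the connection, authkey and serializer handling is trimmed down:

# Simplified paraphrase of multiprocessing.managers.AutoProxy (trimmed):
from multiprocessing import current_process
from multiprocessing.managers import MakeProxyType, dispatch, listener_client

def AutoProxy(token, serializer, manager=None, authkey=None,
              exposed=None, incref=True):
    _Client = listener_client[serializer][1]
    if exposed is None:
        # ask the 'canonical values' server which public methods the
        # registered class exposes for this token
        conn = _Client(token.address, authkey=authkey)
        try:
            exposed = dispatch(conn, None, 'get_methods', (token,))
        finally:
            conn.close()
    if authkey is None:
        authkey = current_process().authkey
    # build a BaseProxy subclass exposing exactly those methods, then
    # instantiate it; _isauto makes pickling go back through AutoProxy
    ProxyType = MakeProxyType('AutoProxy[%s]' % token.typeid, exposed)
    proxy = ProxyType(token, serializer, manager=manager,
                      authkey=authkey, incref=incref)
    proxy._isauto = True
    return proxy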

These proxy objects use reference counting to track when the server can remove the canonical value. It is that part that is broken in the AutoProxy callable: a new argument is passed to the proxy type to disable reference counting when the proxy object is being created in the server process rather than in a client, but the AutoProxy callable wasn't updated to support this.
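To tie this to the traceback above: when a proxy arrives at the server and is unpickled there, RebuildProxy adds that new argument before calling the registered proxy callable. Roughly, paraphrasing the 3.6 source (not a verbatim copy; this is what the traceback is executing, not code you would write yourself):

# Paraphrase of multiprocessing.managers.RebuildProxy (Python 3.6+), trimmed:
from multiprocessing import current_process

def RebuildProxy(func, token, serializer, kwds):
    server = getattr(current_process(), '_manager_server', None)
    if server and server.address == token.address:
        # We are unpickling inside the manager's own server process, so the
        # proxy must not increment the reference count of the canonical value.
        kwds['manager_owned'] = True
    incref = kwds.pop('incref', True)
    # func is AutoProxy for types registered without an explicit proxytype,
    # and AutoProxy() does not accept manager_owned -> the TypeError above.
    return func(token, serializer, incref=incref, **kwds)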

So, how can you fix this? Here are those 3 options:

Use the MakeProxyType() callable

As mentioned, AutoProxy is really just a call (via the server) to get the public methods of the type, and a call to MakeProxyType(). You can just make these calls yourself, when registering.

So, instead of

from multiprocessing.managers import SyncManager
SyncManager.register("YourType", YourType)

use

from multiprocessing.managers import SyncManager, MakeProxyType, public_methods
#               arguments:    classname,  sequence of method names
YourTypeProxy = MakeProxyType("YourType", public_methods(YourType))
SyncManager.register("YourType", YourType, YourTypeProxy)

Feel free to inline the MakeProxyType() call there.
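That is, the two statements collapse into a single registration call (YourType again standing in for your own class):

SyncManager.register(
    "YourType", YourType, MakeProxyType("YourType", public_methods(YourType))
)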

If you were using the exposed argument to SyncManager.register(), you should pass those names to MakeProxyType instead:

# SyncManager.register("YourType", YourType, exposed=("foo", "bar"))
# becomes
YourTypeProxy = MakeProxyType("YourType", ("foo", "bar"))
SyncManager.register("YourType", YourType, YourTypeProxy)

You'd have to do this for all the pre-registered types, too:

from multiprocessing.managers import SyncManager, AutoProxy, MakeProxyType, public_methods

registry = SyncManager._registry
for typeid, (callable, exposed, method_to_typeid, proxytype) in registry.items():
    if proxytype is not AutoProxy:
        continue
    create_method = hasattr(SyncManager, typeid)
    if exposed is None:
        exposed = public_methods(callable) 
    SyncManager.register(
        typeid,
        callable=callable,
        exposed=exposed,
        method_to_typeid=method_to_typeid,
        proxytype=MakeProxyType(f"{typeid}Proxy", exposed),
        create_method=create_method,
    )

Create custom proxies

You don't have to rely on multiprocessing creating a proxy for you; you can write your own. The proxy is used in all processes except for the special 'managed values' server process, and the proxy has to pass messages back and forth with the server. This is not an option for the already-registered types, of course, but I'm mentioning it here because for your own types it offers opportunities for optimisation.

Note that you should have methods for all interactions that need to go back to the 'canonical' value instance, so you'd need to use properties to handle normal attributes or add __getattr__, __setattr__ and __delattr__ methods as needed.
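For plain attribute access, a minimal sketch could look like this (Foo stands in for your own class, as in the examples below; this is an illustration rather than a drop-in implementation):

from multiprocessing.managers import BaseProxy, SyncManager

class FooAttrProxy(BaseProxy):
    # methods the server is allowed to call on the canonical Foo instance
    _exposed_ = ("__getattribute__", "__setattr__")

    def __getattr__(self, name):
        # only called when normal lookup fails; never forward proxy internals
        if name.startswith("_"):
            raise AttributeError(name)
        return self._callmethod("__getattribute__", (name,))

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # _token, _id, etc. belong to the proxy itself, keep them local
            return super().__setattr__(name, value)
        return self._callmethod("__setattr__", (name, value))

SyncManager.register("Foo", Foo, FooAttrProxy)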

The advantage is that you can have very fine-grained control over which methods actually need to exchange data with the server process; in my specific case, my proxy class caches information that is immutable (the values never change once the object is created) but is used often. That includes a flag value that controls whether other methods should do anything, so the proxy can just check the flag value and skip talking to the server process when it is not set. Something like this:

from multiprocessing.managers import BaseProxy, SyncManager

class FooProxy(BaseProxy):
    # what methods the proxy is allowed to access through calls
    _exposed_ = ("__getattribute__", "expensive_method", "spam")

    @property
    def flag(self):
        try:
            v = self._flag
        except AttributeError:
            # ask for the value from the server, "realvalue.flag"
            # use __getattribute__ because it's an attribute, not a property
            v = self._flag = self._callmethod("__getattribute__", ("flag",))
        return v

    def expensive_method(self, *args, **kwargs):
        if self.flag:   # cached locally!
            return self._callmethod("expensive_method", args, kwargs)

    def spam(self, *args, **kwargs):
        return self._callmethod("spam", args, kwargs)

SyncManager.register("Foo", Foo, FooProxy)

Because MakeProxyType() returns a BaseProxy subclass, you can combine that class with a custom subclass, saving yourself from having to write any methods that just consist of return self._callmethod(...):

# a base class with the methods generated for us. The second argument
# doubles as the 'permitted' names, stored as _exposed_
FooProxyBase = MakeProxyType(
    "FooProxyBase",
    ("__getattribute__", "expensive_method", "spam"),
)

class FooProxy(FooProxyBase):
    @property
    def flag(self):
        try:
            v = self._flag
        except AttributeError:
            # ask for the value from the server, "realvalue.flag"
            # use __getattribute__ because it's an attribute, not a property
            v = self._flag = self._callmethod("__getattribute__", ("flag",))
        return v

    def expensive_method(self, *args, **kwargs):
        if self.flag:   # cached locally!
            return self._callmethod("expensive_method", args, kwargs)

    def spam(self, *args, **kwargs):
        return self._callmethod("spam", args, kwargs

SyncManager.register("Foo", Foo, FooProxy)

Again, this won't solve the issue with standard types nested inside other proxied values.

Apply a monkeypatch

I use this to fix the AutoProxy callable; it automatically skips patching when you are running a Python version where the fix has already been applied to the source code:

# Backport of https://github.com/python/cpython/pull/4819
# Improvements to the Manager / proxied shared values code
# broke handling of proxied objects without a custom proxy type,
# as the AutoProxy function was not updated.
#
# This code adds a wrapper to AutoProxy if it is missing the
# new argument.

import logging
from inspect import signature
from functools import wraps
from multiprocessing import managers


logger = logging.getLogger(__name__)
orig_AutoProxy = managers.AutoProxy


@wraps(managers.AutoProxy)
def AutoProxy(*args, incref=True, manager_owned=False, **kwargs):
    # Create the autoproxy without the manager_owned flag, then
    # update the flag on the generated instance. If the manager_owned flag
    # is set, `incref` is disabled, so set it to False here for the same
    # result.
    autoproxy_incref = False if manager_owned else incref
    proxy = orig_AutoProxy(*args, incref=autoproxy_incref, **kwargs)
    proxy._owned_by_manager = manager_owned
    return proxy


def apply():
    if "manager_owned" in signature(managers.AutoProxy).parameters:
        return

    logger.debug("Patching multiprocessing.managers.AutoProxy to add manager_owned")
    managers.AutoProxy = AutoProxy

    # re-register any types already registered to SyncManager without a custom
    # proxy type, as otherwise these would all be using the old unpatched AutoProxy
    SyncManager = managers.SyncManager
    registry = managers.SyncManager._registry
    for typeid, (callable, exposed, method_to_typeid, proxytype) in registry.items():
        if proxytype is not orig_AutoProxy:
            continue
        create_method = hasattr(managers.SyncManager, typeid)
        SyncManager.register(
            typeid,
            callable=callable,
            exposed=exposed,
            method_to_typeid=method_to_typeid,
            create_method=create_method,
        )

Import the above and call the apply() function to fix multiprocessing. Do so before you start the manager server!
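For example, assuming you saved the module above as autoproxy_patch.py (the name is arbitrary), usage could look like this:

# Hypothetical usage; autoproxy_patch is the module containing the code above.
import autoproxy_patch
autoproxy_patch.apply()   # patch at import time, before any manager is started

from multiprocessing.managers import SyncManager

class Wrapper:
    def set(self, ent):
        self.ent = ent

class MyManager(SyncManager): pass
MyManager.register("Wrapper", Wrapper)   # registered with the patched AutoProxy

if __name__ == "__main__":
    manager = MyManager()
    manager.start()
    collection = manager.Wrapper()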

Martijn Pieters
    Impressive work on this answer. It's somewhat of an embarrassment that this issue has gone unfixed for years, especially as multiprocessing is becoming a more and more important part of Python due to its ability to circumvent the GIL. – user4815162342 Oct 12 '20 at 07:44

Solution by editing the multiprocessing source code

The original answer by Sergey requires you to edit the multiprocessing source code as follows:

  1. Find your multiprocessing package (mine, installed via Anaconda, was in /anaconda3/lib/python3.6/multiprocessing).
  2. Open managers.py
  3. Add the keyword argument manager_owned=True to the AutoProxy function signature.

Original AutoProxy:

def AutoProxy(token, serializer, manager=None, authkey=None,
              exposed=None, incref=True):
    ...

Edited AutoProxy:

def AutoProxy(token, serializer, manager=None, authkey=None,
              exposed=None, incref=True, manager_owned=True):
    ...

Solution via code, at run time

I have managed to solve the unexpected keyword argument TypeError without directly editing the source code of multiprocessing, by instead adding these few lines of code where I use multiprocessing's Managers:

import multiprocessing

# Backup original AutoProxy function
backup_autoproxy = multiprocessing.managers.AutoProxy

# Defining a new AutoProxy that accepts the unwanted keyword argument 'manager_owned'
def redefined_autoproxy(token, serializer, manager=None, authkey=None,
                        exposed=None, incref=True, manager_owned=True):
    # Calling the original AutoProxy without the unwanted keyword argument
    return backup_autoproxy(token, serializer, manager, authkey,
                            exposed, incref)

# Updating AutoProxy definition in multiprocessing.managers package
multiprocessing.managers.AutoProxy = redefined_autoproxy

Luca Cappelletti
  • Which version of python does this work with? Under 3.6.6 my multiprocessing module doesn't have a managers attribute. $ python3 Python 3.6.6 (default, Aug 13 2018, 18:24:23) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import multiprocessing >>> multiprocessing.managers Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: module 'multiprocessing' has no attribute 'managers' – razeh Oct 07 '19 at 20:58
  • I would say `3.6.4`, but I'm not sure. – Luca Cappelletti Oct 08 '19 at 05:50
  • This solution produced seg faults for me in Python 3.6.9. There is an open pull request (as of this writing) that solves the problem in the Python library. I've posted an alternate in-code solution to this using the pull request and not observed the same seg faults: https://stackoverflow.com/questions/56716470/multiprocessing-manager-nested-shared-objects-doesnt-work-with-queue – David Parks Nov 02 '19 at 03:53
  • @razeh: you need to import `multiprocessing.managers` directly. – Martijn Pieters Sep 08 '20 at 16:11
  • Your described edit and your `redefined_autoproxy()` function are wrong, I'm afraid. You should not just ignore the `manager_owned` flag here, and neither should it default to `True`. I have answered as well, with a patch that doesn't make this mistake. Basically, if the `manager_owned` flag is set, you should set the `incref` flag to `False`, and correct the `_owned_by_manager` value on the instance that is returned. Also, there are several types already registered on the `SyncManager` class that need re-registering (otherwise pickle will complain, loudly). – Martijn Pieters Sep 08 '20 at 16:19

Found a temporary solution. I've managed to fix it by adding the needed keyword argument to AutoProxy in multiprocessing\managers.py, though I don't know what this kwarg is responsible for.

Sergey Dylda
  • Yes, the extra argument is responsible for making sure proxy objects in the shared value server are deleted properly. Do make sure you pass the value on to the proxy type generated with MakeProxyType. Or handle that case explicitly as I do in my monkeypatch fix (see my answer). – Martijn Pieters Sep 08 '20 at 23:13

If you encounter an error like PickleError: Can't pickle <class 'multiprocessing.managers.xxxx'>: attribute lookup xxxx on multiprocessing.managers failed after implementing Martijn's excellent solution, you can try this patch:

Unlike the auto-generated AutoProxy proxies, a proxy class created by MakeProxyType is not actually defined in the multiprocessing.managers namespace, so pickle cannot find it there. You need to add it to that namespace with setattr, like this:

import multiprocessing.managers as mms
from multiprocessing.managers import SyncManager, MakeProxyType, public_methods

# Task is your own class that you register with the manager
TaskProxy = MakeProxyType('Task', public_methods(Task))
setattr(mms, 'Task', TaskProxy)
SyncManager.register('Task', Task, TaskProxy)

TaskProxy is the proxy class you created; setattr adds it to the multiprocessing.managers namespace so pickle can look it up by name. Then it should work.
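As a sketch, combining this with Martijn's MakeProxyType() registration for the classes from the question might look like this (MyClass has no public methods, so its proxy simply exposes nothing):

import multiprocessing.managers as mms
from multiprocessing.managers import SyncManager, MakeProxyType, public_methods

class MyClass: pass

class Wrapper:
    def set(self, ent):
        self.ent = ent

# Named proxy types, made importable from multiprocessing.managers so that
# pickle can find them when proxies are passed between processes.
MyClassProxy = MakeProxyType('MyClassProxy', public_methods(MyClass))
WrapperProxy = MakeProxyType('WrapperProxy', public_methods(Wrapper))
setattr(mms, 'MyClassProxy', MyClassProxy)
setattr(mms, 'WrapperProxy', WrapperProxy)

SyncManager.register('MyClass', MyClass, MyClassProxy)
SyncManager.register('Wrapper', Wrapper, WrapperProxy)

Keep the registrations and setattr calls at module level (outside the if __name__ == '__main__': guard); with the spawn start method used on Windows, the manager's server process re-imports your main module and needs these proxy classes to exist under multiprocessing.managers as well.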

Kyle Wang