
I'm having some trouble trying to implement a new defaultdict proxy object. The documentation is a bit scarce, so I'm not sure how to go about this correctly.

I want to add a defaultdict to the list of types that are available from the Manager instance. You cannot use the Manager.register method on the stock multiprocessing.Manager, so I've made my own stub Manager from multiprocessing.managers.BaseManager:

class Manager(BaseManager):
    pass

I then created my subclass of multiprocessing.managers.BaseProxy to house the defaultdict (I did initially try having another stub which would subclass both defaultdict and BaseProxy, but that didn't seem to work). Here's what I currently have:

class ProxyDefaultDict(BaseProxy):
    def __init__(self, default_factory=None, *args, **kwargs):
        self.__data = defaultdict(default_factory)
        super().__init__(*args, **kwargs)

    def _callmethod(self, methodname, args=(), kwds={}):
        return getattr(self.__data, methodname)(*args, **kwds)

    def _getvalue(self):
        return self.__data.copy()

    def __repr__(self):
        return self.__data.__repr__()

    def __str__(self):
        return self.__data.__str__()

Manager.register('defaultdict', ProxyDefaultDict)

The end goal is to have a shared dictionary which safely shares keyed Locks across processes and threads. Here's an example of how I imagine it would be initialised:

if __name__ == '__main__':
    m = Manager()
    d = m.defaultdict(m.Lock)
    with d['named_lock']:
        print('holding the lock')

However, I've hit a few problems:

  1. A subclass of BaseManager seems to be initialisable only via a context manager, i.e. with Manager() as m, whereas I would like to use m = Manager() in this case, as multiprocessing.Manager allows. Not the end of the world, but I'm curious why this is the case and whether it's a sign I'm doing something incorrectly.

  2. Subclassing multiprocessing.managers.BaseManager also means you lose the default registered types from multiprocessing.Manager. In this case I need to re-register a ProxyLock for my manager (and I'm also unsure of the expected way to do this). Is it safe to just subclass multiprocessing.Manager directly?

  3. Finally, my ProxyDefaultDict doesn't seem to allow me to cleanly override its __init__, and I'm wary of not calling BaseProxy.__init__ when subclassing. The problem is that BaseProxy also accepts positional arguments. I guess the way round this is to make the default_factory argument keyword-only, but that changes the expected interface to defaultdict and makes me assume I'm doing something incorrectly here again. The other types like Manager.Lock seem to be able to accept positional arguments.

Thanks for any help.

freebie

1 Answer


After viewing the source code, I found that a small modification of it gives me a defaultdict proxy type without issue (based on how the built-in DictProxy is created).

from collections import defaultdict

from multiprocessing.managers import MakeProxyType, SyncManager

DefaultDictProxy = MakeProxyType("DefaultDictProxy", [
    '__contains__', '__delitem__', '__getitem__', '__len__',
    '__setitem__', 'clear', 'copy', 'default_factory', 'fromkeys',
    'get', 'items', 'keys', 'pop', 'popitem', 'setdefault',
    'update', 'values'])

SyncManager.register("defaultdict", defaultdict, DefaultDictProxy)
# Can also create your own Manager here, just using built in for simplicity

if __name__ == '__main__':
    with SyncManager() as sm:
        dd = sm.defaultdict(list)
        print(dd['a'])
        # []

Personally, I find it handy that, by using the tools already provided, you don't even need to worry about how to subclass BaseProxy yourself.
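If you would rather not register on the stock SyncManager, the same idea can be sketched with your own subclass. This is only a sketch: MyManager is a name I made up for illustration, and I left 'default_factory' out of the exposed list because it is an attribute rather than a method. Subclassing SyncManager instead of BaseManager keeps the types that are already registered there, such as Lock.

```python
from collections import defaultdict
from multiprocessing.managers import MakeProxyType, SyncManager

# Exposed methods, modelled on the list above ('default_factory' omitted
# here because it is an attribute, not a callable method).
DefaultDictProxy = MakeProxyType("DefaultDictProxy", [
    '__contains__', '__delitem__', '__getitem__', '__len__',
    '__setitem__', 'clear', 'copy', 'fromkeys',
    'get', 'items', 'keys', 'pop', 'popitem', 'setdefault',
    'update', 'values'])


class MyManager(SyncManager):
    """Hypothetical subclass; inherits SyncManager's registered types."""


# Registering on the subclass leaves SyncManager itself untouched.
MyManager.register("defaultdict", defaultdict, DefaultDictProxy)

if __name__ == '__main__':
    with MyManager() as m:
        counts = m.defaultdict(int)   # positional args go to defaultdict()
        lock = m.Lock()               # still available via SyncManager
        with lock:
            counts['a'] += 1
        print(counts['a'])
```

Arguments passed to m.defaultdict(...) are forwarded to the registered callable in the manager process, so there is no need to override the proxy's __init__ at all.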

However, I don't think that will allow you to create the default-lock scenario you are looking for. Multiprocessing locks are designed to be shared only through inheritance, and in general locks cannot be pickled, which is a requirement for any value transferred through a proxy. Example:

    from multiprocessing import Lock

    m = SyncManager()
    m.start()
    d = m.defaultdict(Lock)
    print(d['named_lock'])
    m.shutdown()

This raises a RuntimeError:

RuntimeError: Lock objects should only be shared between processes through inheritance
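
One possible way around this, as a rough sketch rather than a tested recipe, is to keep the locks inside the manager process and only send the key over the proxy, so no lock ever has to be pickled. NamedLocks, LockManager and worker below are names I invented for illustration; the sketch assumes the manager server handles each client connection in its own thread, so a blocking acquire for one key does not stall other callers.

```python
import threading
from collections import defaultdict
from multiprocessing import Process
from multiprocessing.managers import BaseManager


class NamedLocks:
    """Hypothetical helper that lives in the manager process."""

    def __init__(self):
        self._guard = threading.Lock()             # protects the dict itself
        self._locks = defaultdict(threading.Lock)  # one lock per key

    def _get(self, name):
        with self._guard:
            return self._locks[name]

    def acquire(self, name, blocking=True, timeout=-1):
        return self._get(name).acquire(blocking, timeout)

    def release(self, name):
        self._get(name).release()


class LockManager(BaseManager):
    pass


# The default proxy exposes the public methods (acquire, release).
LockManager.register("NamedLocks", NamedLocks)


def worker(locks, n):
    locks.acquire("named_lock")
    try:
        print("worker", n, "holding the lock")
    finally:
        locks.release("named_lock")


if __name__ == '__main__':
    with LockManager() as m:
        locks = m.NamedLocks()
        procs = [Process(target=worker, args=(locks, n)) for n in range(3)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
```

Note that this gives up the with d['named_lock']: syntax, and a process that dies while holding a key never releases it, so treat it as a starting point only.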
CasualDemon
  • Ah yeah, looks like you can call `register` on `SyncManager` but not `Manager`, thanks. I get a different error from yours when I try to use a lock as a default. When trying to use a managed lock, e.g. `m.defaultdict(m.Lock)`, I get a `TypeError: Pickling an AuthenticationString object is disallowed for security reasons` error. With an unmanaged lock I get `Unserializable message: ('#RETURN', )`. Running on Python 3.4. I would have thought that if a Lock could be proxied it could be pickled. – freebie Oct 10 '17 at 14:50
  • Yeah I think 3.6 just has an updated error / method of detection. I get that same error if I try using threading locks. There are some other ideas https://stackoverflow.com/questions/17960296/trouble-using-a-lock-with-multiprocessing-pool-pickling-error to get around that. – CasualDemon Oct 10 '17 at 17:44