
I have a large Python 3.6 system where multiple processes and threads interact with each other and the user. Simplified, there is a Scheduler instance (subclasses threading.Thread) and a Worker instance (subclasses multiprocessing.Process). Both objects run for the entire duration of the program.

The user interacts with the Scheduler by adding Task instances, and the Scheduler passes each task to the Worker at the correct moment in time. The Worker uses the information contained in the task to do its work.

Below is a stripped-down, simplified excerpt from the project:

import multiprocessing
import threading


class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'


class Scheduler(threading.Thread):
    def __init__(self, worker: 'Worker'):  # forward reference: Worker is defined below
        super().__init__()
        self.worker = worker
        self.start()

    def run(self):
        while True:
            # Do stuff until the user schedules a new task
            task = Task('example')  # <-- In reality the Task instance is not created here; the thread gets it from elsewhere
            task.state = 'scheduled'
            self.worker.change_task(task)

            # Do stuff until the task.state == 'finished'

class Worker(multiprocessing.Process):
    def __init__(self):
        super().__init__()
        self.current_task = None
        self.start()

    def change_task(self, new_task:Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'

    def run(self):
        while True:
            # Do stuff until the current task is updated
            self.current_task.state = 'accepted-running'
            # Task is running
            self.current_task.state = 'finished'

The system used to be structured so that each task contained multiple multiprocessing.Event objects, one for each of its possible states. Instead of passing the whole Task instance to the worker, each of the task's attributes was passed individually. As they were all multiprocessing-safe, this worked, with one caveat: the events changed in worker.run had to be created in worker.run and passed back to the task object for it to work. Not only is this a less than ideal solution, it also no longer works with some changes I am making to the project.
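To illustrate the kind of structure described above, here is a rough sketch of a task that tracks its state with one multiprocessing.Event per state. The names EventTask, STATES, set_state and current_state are assumptions made up for this illustration; the project's original code is not shown here and may have looked quite different.

```python
import multiprocessing

# Hypothetical list of states, taken from the state strings used in the code above.
STATES = ('idle', 'scheduled', 'accepted-idle', 'accepted-running', 'finished')


class EventTask:
    """Hypothetical task whose state is tracked with one Event per state."""

    def __init__(self, name: str):
        self.name = name
        self.events = {state: multiprocessing.Event() for state in STATES}
        self.events['idle'].set()

    def set_state(self, state: str):
        # Clear every flag, then raise the one that belongs to the new state.
        for event in self.events.values():
            event.clear()
        self.events[state].set()

    def current_state(self) -> str:
        # Return the first (and only) state whose flag is raised.
        return next(s for s, e in self.events.items() if e.is_set())


task = EventTask('example')
task.set_state('scheduled')
print(task.current_state())  # scheduled
```

This sketch runs in a single process only; it does not reproduce the caveat about where the events have to be created.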

Back to the current state of the project, as described by the Python code above. As is, this will never work because nothing makes it multiprocessing-safe at the moment. So I implemented a Proxy/BaseManager structure so that when a new Task is needed, the system gets it from the multiprocessing manager. I use this structure in a slightly different way elsewhere in the project as well. The issue is that worker.run never sees that self.current_task has been updated; it remains None. I expected this to be fixed by using the proxy, but clearly I am mistaken.
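A minimal sketch of the symptom, assuming change_task is called from the Scheduler thread in the parent process as in the simplified code above. To keep it runnable without starting real processes, the process boundary is simulated with a pickle round-trip, and WorkerState and simulate_child_copy are hypothetical stand-ins, not part of the project. The point it tries to show is that the child process works on its own copy of the Worker object, so an attribute assigned on the parent's copy is not visible in the child.

```python
import pickle


class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'


class WorkerState:
    """Hypothetical stand-in for the Worker's attributes (not a real Process)."""

    def __init__(self):
        self.current_task = None

    def change_task(self, new_task: Task):
        self.current_task = new_task
        self.current_task.state = 'accepted-idle'


def simulate_child_copy(obj):
    """Return an independent copy, similar to what a child process holds."""
    return pickle.loads(pickle.dumps(obj))


parent_side = WorkerState()
child_side = simulate_child_copy(parent_side)  # what run() would see in the child

# The Scheduler thread lives in the parent, so this only mutates the parent's copy.
parent_side.change_task(Task('example'))

print(parent_side.current_task.state)  # accepted-idle
print(child_side.current_task)         # None
```

If this is indeed what happens, the Task proxy alone cannot help, because the assignment to self.current_task itself never crosses the process boundary.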

import types
import typing
from multiprocessing.managers import BaseManager, NamespaceProxy


def Proxy(target: typing.Type) -> typing.Type:
    """
    Normally a Manager exposes only an object's methods. A NamespaceProxy can be used when registering the object with
    the manager to expose all the attributes. This also works for attributes created at runtime.
    https://stackoverflow.com/a/68123850/8353475

    1. Instead of exposing all the attributes manually, we effectively override __getattr__ to do it dynamically.
    2. Instead of defining a class that subclasses NamespaceProxy for each specific object class that needs to be
    proxied, this method is used to do it dynamically. The target parameter should be the class of the object you want
    to generate the proxy for. The generated proxy class will be returned.
    Example usage: FooProxy = Proxy(Foo)

    :param target: The class of the object to build the proxy class for
    :return: The generated proxy class
    """

    # __getattr__ is called when an attribute 'bar' is looked up on 'foo' (e.g. 'foo.bar') and is not found through
    # normal lookup. 'bar' can be a method as well as a data attribute. The lookup gets rerouted from the proxy to the
    # base object, where it is processed.
    def __getattr__(self, key):
        result = self._callmethod('__getattribute__', (key,))
        # If attr call was for a method we need some further processing
        if isinstance(result, types.MethodType):
            # A wrapper around the method that passes the arguments, actually calls the method and returns the result.
            # Note that at this point wrapper() does not get called, just defined.
            def wrapper(*args, **kwargs):
                # Call the method and pass the return value along
                return self._callmethod(key, args, kwargs)

            # Return the wrapper method (not the result, but the method itself)
            return wrapper
        else:
            # If the attr call was for a variable it can be returned as is
            return result

    dic = {'types': types, '__getattr__': __getattr__}
    proxy_name = target.__name__ + "Proxy"
    ProxyType = type(proxy_name, (NamespaceProxy,), dic)
    # This is a tuple of all the attributes that are/will be exposed. We copy all of them from the base class
    ProxyType._exposed_ = tuple(dir(target))
    return ProxyType


class TaskManager(BaseManager):
    pass


TaskProxy = Proxy(Task)
TaskManager.register('get_task', callable=Task, proxytype=TaskProxy)
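As a small check of what the generated proxy class ends up exposing, the sketch below repeats the type(...) construction from Proxy() without the __getattr__ override and inspects the result. It does not start a manager or any process. It suggests that instance attributes such as state are not part of dir(Task), while __getattribute__ and __setattr__ are, which appears to be what lets attribute access be forwarded to the managed object.

```python
from multiprocessing.managers import NamespaceProxy


class Task:
    def __init__(self, name: str):
        self.name = name
        self.state = 'idle'


# Same construction as in Proxy(), minus the __getattr__ override.
TaskProxy = type(Task.__name__ + 'Proxy', (NamespaceProxy,), {})
TaskProxy._exposed_ = tuple(dir(Task))

print(TaskProxy.__name__)                          # TaskProxy
print(issubclass(TaskProxy, NamespaceProxy))       # True
print('state' in TaskProxy._exposed_)              # False
print('__getattribute__' in TaskProxy._exposed_)   # True
print('__setattr__' in TaskProxy._exposed_)        # True
```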