
I have a multiprocessing Lock that I define as:

import multiprocessing as mp
import os

lock1 = mp.Lock()

To share this lock among the different child processes I do:

def setup_process(lock1):
    global lock_1
    lock_1 = lock1

pool = mp.Pool(os.cpu_count() - 1,
               initializer=setup_process,
               initargs=[lock1])

Now I've noticed that if the processes call the following function, and the function is defined in the same python module (i.e., same file):

def test_func():
    print("lock_1:", lock_1)
    with lock_1:
        print(str(mp.current_process()) + " has the lock in test function.")

I get an output like:

lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-1' parent=82414 started daemon> has the lock in test function.
lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-2' parent=82414 started daemon> has the lock in test function.
lock_1 <Lock(owner=None)>
<ForkProcess name='ForkPoolWorker-3' parent=82414 started daemon> has the lock in test function.

However, if test_func is defined in a different file, the lock is not recognized, and I get:

NameError:
name 'lock_1' is not defined

This seems to happen for every function, where the important distinction is whether the function is defined in this module or in another one. I'm sure I'm missing something very obvious with the global variables, but I'm new to this and I haven't been able to figure it out. How can I make the Locks be recognized everywhere?

Aaron
nabla

1 Answer


Well, I learned something new about Python today: global isn't actually truly global. A name declared global lives in the namespace of the module where the function is defined, and is only accessible from that module's scope.
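As a rough illustration of that module-level scoping, here is a minimal sketch that has nothing to do with multiprocessing. The module named other is built in memory purely as a stand-in for a separate .py file, and read_lock and setup are hypothetical names used only for this example.

```python
import types

# Throwaway in-memory module, standing in for a separate .py file.
other = types.ModuleType("other")
exec(
    "def read_lock():\n"
    "    # lock_1 is looked up in other's globals, not in the caller's.\n"
    "    return lock_1\n",
    other.__dict__,
)

def setup():
    # Binds lock_1 in the namespace of the module that defines setup().
    global lock_1
    lock_1 = "the lock"

setup()
print(lock_1)  # fine: same module as setup()

try:
    other.read_lock()
except NameError as exc:
    print("NameError:", exc)  # the other module never received the name

# Explicitly giving the other module the name makes it visible there.
other.lock_1 = lock_1
print(other.read_lock())
```

The same lookup rule applies inside each pool worker, which is presumably why the initializer only helps functions defined in the same file.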

There are many ways to share your lock with the other module, and the docs even suggest a "canonical" way of sharing globals between modules (though I don't feel it's the most appropriate for this situation). To me this illustrates one of the shortcomings of using globals in the first place, though I have to admit that for multiprocessing.Pool initializers, using globals to pass data to worker functions seems to be the accepted or even intended approach. It actually makes sense that globals can't cross module boundaries: otherwise the separate module would be 100% dependent on being executed by a specific script, so it couldn't really be considered an independent library, and it might as well be included in the same file. I recognize that this may be at odds with splitting code into modules simply to organize it into shorter, easier-to-read pieces rather than to create reusable libraries, but that's apparently a design choice by the designers of Python.
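For reference, the "canonical" approach from the docs is a dedicated module (often called config) that every other module imports. The sketch below only approximates it: config and mymodule are created in memory as stand-ins for hypothetical config.py and mymodule.py files, and threading.Lock stands in for the multiprocessing lock so that the snippet runs on its own without a pool.

```python
import sys
import threading
import types

# Stand-in for a hypothetical config.py that holds the shared state.
config = types.ModuleType("config")
config.lock_1 = None
sys.modules["config"] = config  # lets "import config" find it

# Stand-in for a hypothetical mymodule.py that needs the lock.
mymodule = types.ModuleType("mymodule")
exec(
    "import config\n"
    "\n"
    "def test_func():\n"
    "    # Look the lock up on the config module at call time.\n"
    "    with config.lock_1:\n"
    "        return 'has the lock in test_func'\n",
    mymodule.__dict__,
)

def setup_process(lock):
    # Would be passed as the Pool initializer; it stores the lock on
    # config instead of in this module's globals.
    config.lock_1 = lock

setup_process(threading.Lock())  # stand-in for the inherited mp.Lock()
print(mymodule.test_func())
```

Note that the other module has to access config.lock_1 through the module object. Writing from config import lock_1 would copy whatever the name was bound to at import time (here None) and would not see the value set later by the initializer.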

To solve your problem, at the end of the day, you are going to have to pass the lock to the other module as an argument, so you might as well make test_func receive lock_1 as an argument. You may have found, however, that passing the lock through the pool's task arguments raises RuntimeError: Lock objects should only be shared between processes through inheritance. So what to do? Basically, I would keep your initializer and wrap test_func in another function that lives in the __main__ scope (and therefore has access to your global lock_1), which grabs the lock and then passes it to the function. Unfortunately we can't use a nicer-looking decorator or a wrapper function, because those return a function which only exists in a local scope and can't be imported when using "spawn" as the start method.

from multiprocessing import Pool, Lock, current_process

from mymodule import module_test_func  # same as local_test_func, but defined in another file

def init(l):
    # Runs once in each worker: store the inherited lock in this module's globals.
    global lock_1
    lock_1 = l

def local_test_func(shared_lock):
    with shared_lock:
        print(f"{current_process()} has the lock in local_test_func")

def local_wrapper():
    # Defined in __main__, so it can read lock_1 and pass it on explicitly.
    local_test_func(lock_1)

def module_wrapper():
    # Same idea: fetch the global here and hand it to the function in the other module.
    module_test_func(lock_1)

if __name__ == "__main__":
    l = Lock()
    with Pool(initializer=init, initargs=(l,)) as p:
        p.apply(local_wrapper)
        p.apply(module_wrapper)
Aaron
  • I'm accepting your answer, although I had already found a different solution. I found the use of globals very inelegant and a bit dirty, so I resorted to using a Manager and passing the variables as arguments. – nabla Mar 26 '21 at 18:45
  • @nabla there are lots of ways to do it, but I think the "answer" I was trying to stress is that "global can't be used across modules (even outside of multiprocessing)". I will admit my solution wasn't the most elegant, but it does work, and addresses the underlying problem. I'm glad you got yours working. – Aaron Mar 26 '21 at 18:50
  • I didn't mean your solution wasn't elegant, but the use of globals in general through initargs ;) I arrived at this impasse because I was following other solutions that I found here on Stack Overflow, and I also had a different idea of what a global variable was (coming from C). – nabla Mar 26 '21 at 18:54