I'm trying to use multiprocessing.Pool to speed up the execution of a function across a range of inputs. The worker processes seem to have been started, since my task manager shows a substantial increase in CPU utilization, but the task never terminates. No exceptions are ever raised, runtime or otherwise.

from multiprocessing import Pool

def f(x):
    print(x)
    return x**2

class Klass:
    def __init__(self):
        pass

    def foo(self):
        X = list(range(1, 1000))
        with Pool(15) as p:
            result = p.map(f, X)

if __name__ == "__main__":
    obj = Klass()
    obj.foo()
    print("All Done!")

Interestingly, despite the uptick in CPU utilization, print(x) never prints anything to the console.

I have moved the function f outside of the class as was suggested here, to no avail. I have also tried adding p.close() and p.join(), with no success. Using other Pool methods like imap leads to TypeError: can't pickle _thread.lock objects, and seems to be a step away from the example usage in the introduction of the Python multiprocessing documentation.

Adding to the confusion, if I try running the code above enough times (killing the hung kernel after each attempt) the code begins consistently working as expected. It usually takes about twenty attempts before this "clicks" into place. Restarting my IDE reverts the now functional code back to the former broken state. For reference, I am running using the Anaconda Python Distribution (Python 3.7) with the Spyder IDE on Windows 10. My CPU has 16 cores, so the Pool(15) is not calling for more processes than I have CPU cores. However, running the code with a different IDE, like Jupyter Lab, yields the same broken results.

Others have suggested that this may be a flaw with Spyder itself, but the suggestion to use multiprocessing.Pool instead of multiprocessing.Process doesn't seem to work either.

NolantheNerd

2 Answers

This seems like it might be a problem with both Spyder and Jupyter. If you run the above code in the console directly, everything works as intended.

NolantheNerd

This could be related to the following note in the Python documentation:

Note: Functionality within this package requires that the __main__ module be importable by the children. This is covered in Programming guidelines however it is worth pointing out here. This means that some examples, such as the multiprocessing.pool.Pool examples will not work in the interactive interpreter.

and then this comment on their example:

If you try this it will actually output three full tracebacks interleaved in a semi-random fashion, and then you may have to stop the master process somehow.
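
To see why the children need to be able to import the module, it may help to look at what actually travels to a worker process. As far as I understand it, pickle serializes a function by reference (module name plus qualified name) rather than shipping its code. The sketch below uses json.dumps purely as a stand-in for any function that lives in an importable module; that choice is mine and not something taken from the docs.

```python
import json
import pickle

# json.dumps is only a stand-in for "a function defined in an importable module".
payload = pickle.dumps(json.dumps)

# The payload holds the module and function names, not the function body.
print(payload)
print(len(payload), "bytes")

# Unpickling imports the module by name and looks the attribute up again,
# which is what each worker process has to do with the function it receives.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

A function typed into a notebook cell or IDE console belongs to __main__, and a freshly started worker has its own, different __main__, so that lookup can fail. This would be consistent with the "Can't get attribute 'f'" message, although I have not traced it through Spyder specifically.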

UPDATE: The info found here seems to confirm that using the pool from an interactive interpreter will have varying success. This guidance is also shared...

...guidance [is] to always use functions/classes whose definitions are importable.

This is the solution outlined here, and it works for me every time with your code.
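
A minimal sketch of that approach, under some assumptions: the module name pool_workers and the helpers load_worker_module and run are hypothetical names I made up for illustration. In practice you would just save f in a .py file next to your notebook or script and import it; the sketch writes the file at runtime only so that it is self-contained.

```python
import importlib
import sys
import tempfile
from multiprocessing import Pool
from pathlib import Path

# Hypothetical module name and contents, chosen for this sketch only.
MODULE_NAME = "pool_workers"
MODULE_SOURCE = "def f(x):\n    return x ** 2\n"


def load_worker_module():
    """Write the worker function to a real .py file and import it by name."""
    # The directory is left in place because the workers import from it.
    module_dir = Path(tempfile.mkdtemp())
    (module_dir / f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, str(module_dir))
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)


def run(n_workers=4):
    """Map the imported function over a small range with a process pool."""
    workers = load_worker_module()
    with Pool(n_workers) as p:
        # workers.f is pickled as "pool_workers.f", which each child can import.
        return p.map(workers.f, range(1, 11))


if __name__ == "__main__":
    print(run())
```

The point of the design is only that f no longer lives in __main__, so each worker can find it by importing the module by name. The smaller pool size and input range are there to keep the example quick; they are not part of the fix.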

jayveesea
  • That may be it, but I don't seem to get any of the tracebacks that they mentioned. – NolantheNerd May 07 '20 at 17:42
  • where are you looking? for jupyter you need to look in the console you launched it from. I get "Can't get attribute 'f' on " when i run your code. – jayveesea May 07 '20 at 17:49
  • and i get the same message when i run the python example in jupyter. could also explain why it sometime will work (uses the 15 other cores). – jayveesea May 07 '20 at 18:04
  • I stand corrected. Inside of the console used to launch Jupyter, I get the same error as you: ```AttributeError: Can't get attribute 'f' on ```. Unfortunately, I cannot explain why, after killing the kernel several times, it starts working. I don't have to change any of the above code. It just spontaneously starts running as expected. – NolantheNerd May 07 '20 at 19:16
  • see also [here](https://stackoverflow.com/questions/41385708/multiprocessing-example-giving-attributeerror) – jayveesea May 07 '20 at 19:48