4

This code runs fine under regular CPython 3.5:

import concurrent.futures

def job(text):
    print(text)

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

But if you run it as python -m doctest myfile.py, it hangs. Changing submit(job to submit(print makes it not hang, as does using ThreadPoolExecutor instead of ProcessPoolExecutor.

Why does it hang when run under doctest?
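To reproduce the hang without blocking the terminal, the doctest run can be wrapped in a timeout. This is a minimal sketch, assuming the file is saved as myfile.py as in the command above; run_with_timeout is a made-up helper name, not part of doctest or concurrent.futures.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run cmd; return its exit code, or None if it had to be killed."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child process before re-raising.
        return None
    return completed.returncode


if __name__ == "__main__":
    # "myfile.py" is the file name used in the question.
    code = run_with_timeout([sys.executable, "-m", "doctest", "myfile.py"], 10)
    print("hung (killed after timeout)" if code is None else "exit code %d" % code)
```

A return value of None means the command was still running when the timeout expired, which is how the hang shows up here.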

John Zwinck

4 Answers

8

I think the issue is caused by your with statement. When you have the following:

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

It forces the work to be executed and the pool to be closed right then and there. When you run the file as the main process, this works, because the background thread gets time to execute the job. But when you import it as a module, the background thread does not get a chance to run, and the shutdown on the pool waits for the work to be executed, hence a deadlock.
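To make the shutdown point concrete: leaving a with block around an executor calls its shutdown method with wait=True, so the block does not exit until the submitted work is done. Here is a minimal sketch that swaps in ThreadPoolExecutor for ProcessPoolExecutor purely so that it runs anywhere without the import problem (both get this with-block behaviour from concurrent.futures.Executor). slow_job is a made-up name.

```python
import concurrent.futures
import time


def slow_job(text):
    # Hypothetical stand-in for real work.
    time.sleep(0.2)
    return text


with concurrent.futures.ThreadPoolExecutor(1) as pool:
    future = pool.submit(slow_job, "hello")
    # The job has usually not finished yet at this point.

# Leaving the block called pool.shutdown(wait=True), so the job is done now.
finished_after_with = future.done()
print(finished_after_with)
```

If the submitted job can never complete, that implicit shutdown call is where the program blocks.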

A workaround you can use is:

import concurrent.futures

def job(text):
    print(text)

pool = concurrent.futures.ProcessPoolExecutor(1)
pool.submit(job, "hello")

if __name__ == "__main__":
    pool.shutdown(True)

This will prevent the deadlock and will let you run doctest as well as import the module if you want.

Tarun Lalwani
  • 2
    This answer is a little misleading, because the problem is not with the `with` statement. You can reproduce this behaviour without the `with` statement by doing `pool = ...ProcessPoolExecutor()` `pool.submit(...)` `pool.shutdown()`. The problem is the import lock, as I note in my answer. – daphtdazz Apr 18 '18 at 09:44
  • 1
    @daphtdazz, I do agree with you. I was not aware of `https://docs.python.org/3/library/imp.html#imp.lock_held` to quote that in my answer, I just knew it is an import deadlock. When I said the `with` statement is the issue, I meant that the `__exit__` of the `ProcessPoolExecutor` will execute the `shutdown` method and cause the deadlock with import. Your answer explains one layer below mine. Both are correct in their own context. You explained why it doesn't work and I explained how to make it work. – Tarun Lalwani Apr 18 '18 at 09:49
7

The problem is that importing a module acquires a lock (which lock depends on your Python version); see the docs for imp.lock_held.

Locks are shared over multiprocessing, so the deadlock occurs because your main process, while it is importing your module, starts and waits for a subprocess. That subprocess attempts to import your module but cannot acquire the lock, because the module is currently being imported by your main process.

In step form:

  1. Main process acquires lock to import myfile.py
  2. Main process starts importing myfile.py (it has to import myfile.py because that is where your job() function is defined, which is why it didn't deadlock for print()).
  3. Main process starts and blocks on subprocess.
  4. Subprocess tries to acquire lock to import myfile.py

=> Deadlock.
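The blocking effect of the import lock can be seen in isolation with a minimal sketch. Note the assumptions: it uses a second thread in one process rather than a subprocess, so it does not reproduce the executor case itself, only the waiting. The module name lockdemo_mod and everything inside it are made up for illustration. While the module is still being imported, a thread that tries to import it again has to wait. The join uses a timeout so the demo cannot hang.

```python
import importlib
import os
import sys
import tempfile

# Source of a throwaway module (hypothetical name: lockdemo_mod).
MODULE_SOURCE = '''
import threading

def _import_me_again():
    import lockdemo_mod  # waits until the first import has finished

_t = threading.Thread(target=_import_me_again)
_t.start()
_t.join(timeout=0.5)  # a timeout, so this demo cannot hang
blocked_during_import = _t.is_alive()
'''

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "lockdemo_mod.py"), "w") as f:
        f.write(MODULE_SOURCE)
    sys.path.insert(0, tmp)
    try:
        mod = importlib.import_module("lockdemo_mod")
    finally:
        sys.path.remove(tmp)
    # The import is complete, so the waiting thread can finish now.
    mod._t.join()

print(mod.blocked_during_import)
```

Replacing the timed join with an unconditional join inside the module would make the importing thread wait forever for a thread that is itself waiting for the import to finish, which is the same circular wait as in the steps above.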

daphtdazz
0

doctest imports your module in order to process it. Try adding this to prevent execution on import:

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(1) as pool:
        pool.submit(job, "hello")

Udi
  • That sidesteps the problem by preventing the code from running altogether. But I don't want to prevent the code from running, I want to prevent it from hanging. – John Zwinck Jan 12 '18 at 06:59
  • The code should run when the module is loaded (e.g. by doctest, or regular import), or run as a standalone script. – John Zwinck Jan 12 '18 at 07:11
0

This should actually be a comment, but it's too long to be one.

Your code fails if it's imported as a module too, with the same error as doctest. I get _pickle.PicklingError: Can't pickle <function job at 0x7f28cb0d2378>: import of module 'a' failed (I named the file as a.py).

Your lack of if __name__ == "__main__": violates the programming guidelines for multiprocessing: https://docs.python.org/3.6/library/multiprocessing.html#the-spawn-and-forkserver-start-methods

I guess that the child processes will also try to import the module, which then tries to start another child process (because the pool unconditionally executes). But I'm not 100% sure about this. I'm also not sure why the error you get is can't pickle <function>.

The issue here seems to be that you want the module to auto start a process on import. I'm not sure if this is possible.

Eric
  • I see what you're saying. Still, the problem is that I want to be able to launch a ProcessPoolExecutor within a doctest. That is what I can't get to work. Simply hiding all the code under `if __name__ == "__main__"` doesn't work, because that prevents the code from ever running (under doctest). – John Zwinck Apr 12 '18 at 01:37
  • Why not put the code for the ProcessPoolExecutor in the doctest string so it runs it as a test? Or is there some other use case? – Eric Apr 12 '18 at 08:41