Testing Python multiprocessing pool code with nose

Question

I am trying to write tests with nose that get set up with something calculated using multiprocessing.

I have this directory structure:

code/
    tests/
        tests.py

tests.py looks like this:

import multiprocessing as mp


def f(i):
    return i ** 2


pool = mp.Pool()
out = pool.map(f, range(10))


def test_pool():
    """Really simple test that relies on the output of pool.map.
    The actual tests are much more complicated, but this is all
    that is needed to produce the problem."""
    ref_out = map(f, range(10))
    assert out == ref_out

if __name__ == '__main__':
    test_pool()

Running from the code directory, python tests/tests.py passes.

nosetests tests/tests.py fails to complete. It starts up, but never gets through the call to pool.map and just hangs.

Why is this and what is the simplest solution?

It is possible that `nose` is using some threading and/or logging when running tests. This *can* lead to deadlocks when mixed with multiprocessing on UNIX systems. This is not a problem with the Python implementation but with the `fork()` function itself, which only forks the current thread; see [this](http://stackoverflow.com/questions/6078712/is-it-safe-to-fork-from-within-a-thread/6079669#6079669) answer for a more detailed explanation. — Bakuriu, Sep 05 '13 at 17:58
I believe the only(?) solution would be to mock the `multiprocessing` module. In fact I don't see what your example is testing. It is actually a unittest for the `multiprocessing.Pool.map` method, and not for the `f` function! — Bakuriu, Sep 05 '13 at 18:07
It is the minimal example that reproduces my error. I'm testing a load of other stuff that uses the result of the `pool.map` as input. — aaren, Sep 05 '13 at 18:24
Does it matter that you compute the `map` over multiple cores? If not then replace `pool.map` with a plain `map`. — Bakuriu, Sep 05 '13 at 18:51
Obviously that would solve it! However, that isn't an option here: treat the use of `pool.map` as a constraint on the problem. — aaren, Sep 05 '13 at 19:58

score 4 · Accepted Answer · answered Feb 04 '14 at 09:15

The problem is related to the fact that pool.map is called at the "global level". Normally you want to avoid that, because these statements will be executed even if your file is simply imported.

Nose has to import your module to find your tests and later execute them, so I believe the problem happens while the import mechanism kicks in (I haven't spent time trying to find out the exact reason for this behaviour).
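To make the import point concrete, here is a minimal sketch (Python 3 only, and not something nose itself does in this exact way): it writes a throwaway module to a temporary directory and imports it. The module name demo_tests and the events list are made up for the illustration. The top-level statement runs during the import, while the test function body does not.

```python
import importlib.util
import os
import tempfile
import textwrap

# Source of a hypothetical test module with one top-level side effect.
source = textwrap.dedent("""
    events = []
    events.append("module-level code ran")

    def test_something():
        events.append("test ran")
""")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_tests.py")
    with open(path, "w") as fh:
        fh.write(source)

    # Import the file by path, roughly what a test collector has to do.
    spec = importlib.util.spec_from_file_location("demo_tests", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

# Only the top-level statement has executed; no test was called.
print(module.events)
```

A pool.map call placed at the top level would be executed at that same moment, before any test is run.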

You should move your initialization code to a test fixture instead; Nose supports fixtures with the with_setup decorator. Here is one possibility (probably the simplest change while keeping pool and out as globals):

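import multiprocessing as mp
from nose import with_setup

pool = None
out  = None

def f(i):
    return i ** 2

def setup_func():
    # Runs just before the test, not when nose imports the module.
    global pool
    global out
    pool = mp.Pool()
    out  = pool.map(f, range(10))

def teardown_func():
    # Shut the worker processes down once the test has finished.
    pool.close()
    pool.join()

@with_setup(setup_func, teardown_func)
def test_pool():
    """Really simple test that relies on the output of pool.map.
    The actual tests are much more complicated, but this is all
    that is needed to produce the problem."""
    # list() keeps the comparison valid on Python 3, where map is lazy.
    ref_out = list(map(f, range(10)))
    assert out == ref_out

if __name__ == '__main__':
    # with_setup only takes effect under nose, so call the fixtures by hand.
    setup_func()
    test_pool()
    teardown_func()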

Executing:

$ nosetests tests/tests.py
.
----------------------------------------------------------------------
Ran 1 test in 0.011s

OK
