
I have the following directory layout:

runner.py
lib/
tests/
      testsuite1/
                 testsuite1.py
      testsuite2/
                 testsuite2.py
      testsuite3/
                 testsuite3.py
      testsuite4/
                 testsuite4.py

The format of testsuite*.py modules is as follows:

import os
import pytest

class testsomething:
    def setup_class(self):
        ''' do some setup '''
        # Do some setup stuff here
    def teardown_class(self):
        ''' do some teardown '''
        # Do some teardown stuff here

    def test1(self):
        # Do some test1 related stuff

    def test2(self):
        # Do some test2 related stuff

    ....
    ....
    ....
    def test40(self):
        # Do some test40 related stuff

if __name__ == '__main__':
    pytest.main(args=[os.path.abspath(__file__)])

The problem I have is that I would like to execute the 'testsuites' in parallel i.e. I want testsuite1, testsuite2, testsuite3 and testsuite4 to start execution in parallel but individual tests within the testsuites need to be executed serially.

When I use the 'xdist' plugin from py.test and kick off the tests using 'py.test -n 4', py.test gathers all the tests and randomly load-balances them among 4 workers. This causes the 'setup_class' method to be executed once for every test within a 'testsuitex.py' module, which defeats my purpose. I want setup_class to be executed only once per class, with the tests executed serially thereafter.

Essentially what I want the execution to look like is:

worker1: executes all tests in testsuite1.py serially
worker2: executes all tests in testsuite2.py serially
worker3: executes all tests in testsuite3.py serially
worker4: executes all tests in testsuite4.py serially

while worker1, worker2, worker3 and worker4 are all executed in parallel.

Is there a way to achieve this with the 'pytest-xdist' framework?

The only option that I can think of is to kick off different processes to execute each test suite individually within runner.py:


def test_execute_func(testsuite_path):
    subprocess.call(['py.test', testsuite_path])

if __name__ == '__main__':
    # Gather all the testsuite paths
    for testsuite_path in testsuite_paths:
        multiprocessing.Process(target=test_execute_func,
                                args=(testsuite_path,)).start()
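That idea can be made concrete. Below is a minimal, self-contained sketch of such a runner; the `find_suites` helper and the glob pattern are assumptions based on the directory layout shown above, not part of the question's code:

```python
import glob
import multiprocessing
import subprocess

def run_suite(testsuite_path):
    # Each worker process invokes pytest on exactly one suite file,
    # so the tests inside that file run serially in that process.
    subprocess.call(['py.test', testsuite_path])

def find_suites(root='tests'):
    # Collect one testsuite*.py per subdirectory (layout assumed
    # from the question: tests/testsuiteN/testsuiteN.py).
    return sorted(glob.glob('%s/*/testsuite*.py' % root))

if __name__ == '__main__':
    workers = [multiprocessing.Process(target=run_suite, args=(path,))
               for path in find_suites()]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
```

Because each suite gets its own process, `setup_class` runs once per class and the suites still execute in parallel, at the cost of losing pytest's combined reporting.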
superselector

6 Answers


You can use --dist=loadscope to group all the tests in the same test class. Here is the relevant documentation from pytest-xdist on PyPI:

By default, the -n option will send pending tests to any worker that is available, without any guaranteed order, but you can control this with these options:

--dist=loadscope: tests will be grouped by module for test functions and by class for test methods, then each group will be sent to an available worker, guaranteeing that all tests in a group run in the same process. This can be useful if you have expensive module-level or class-level fixtures. Currently the groupings can't be customized, with grouping by class taking priority over grouping by module. This feature was added in version 1.19.

--dist=loadfile: tests will be grouped by file name, and then will be sent to an available worker, guaranteeing that all tests in a group run in the same worker. This feature was added in version 1.21.
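As a rough illustration (this is not xdist's actual implementation), the difference between the two modes can be thought of as deriving a different grouping key from each test's node ID:

```python
def scope_key(nodeid, mode):
    # A pytest node ID looks like:
    #   'tests/testsuite1/testsuite1.py::testsomething::test1'
    if mode == 'loadfile':
        # Group by file: everything before the first '::'
        return nodeid.split('::', 1)[0]
    if mode == 'loadscope':
        # Group by enclosing class (or module when there is no class):
        # everything up to the last '::'
        return nodeid.rsplit('::', 1)[0]
    raise ValueError(mode)

nodeid = 'tests/testsuite1/testsuite1.py::testsomething::test1'
# loadfile keeps all tests from the same file on one worker ...
assert scope_key(nodeid, 'loadfile') == 'tests/testsuite1/testsuite1.py'
# ... while loadscope keys on the enclosing class
assert scope_key(nodeid, 'loadscope') == 'tests/testsuite1/testsuite1.py::testsomething'
```

With the layout in the question (one class per file), either mode yields the desired one-suite-per-worker behavior, so `setup_class` runs only once per class.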

Fabich

Multiple Options Available

Yes, there are ways to do this. As of xdist version 1.28.0, the available options are:

  • --dist=each: Sends all the tests to all the nodes, so each test is run on every node.
  • --dist=load: Distributes the tests collected across all nodes so each test is run just once. All nodes collect and submit the test suite and when all collections (of test suites) are received it is verified they are identical collections. Then the collection gets divided up in chunks and chunks get submitted to nodes for execution.
  • --dist=loadscope: Distributes the tests collected across all nodes so each test is run just once. All nodes collect and submit the list of tests and when all collections are received it is verified they are identical collections. Then the collection gets divided up in work units, grouped by test scope, and those work units get submitted to nodes.
  • --dist=loadfile: Distributes the tests collected across all nodes so each test is run just once. All nodes collect and submit the list of tests and when all collections are received it is verified they are identical collections. Then the collection gets divided up in work units, grouped by test file, and those work units get submitted to nodes.

If you need any further information, I recommend going straight to the actual implementation of the schedulers and checking out how the distribution is done.

Andriy Ivaneyko
  • I was wondering how does `-n` and ` --forked` or `--boxed` options relate to the the `--dist` strategies. And that it's weird that when using `--dist` you also need the `--tx` option. – CMCDragonkai Sep 02 '20 at 04:38
  • IIRC it impacts the count of processes on the workers for test execution; --dist identifies the strategy for splitting the test suite and sending test cases to workers, and --tx provides a way to access the workers, so there is not much sense in applying a distribution strategy if no workers are specified (actually it's debatable if we go into the details of fixtures). That's my interpretation of the intent behind the way the args were designed :) – Andriy Ivaneyko Sep 02 '20 at 09:48
  • But `-n` creates workers doesn't it? I suspect that it leads to a multiprocessing pool. – CMCDragonkai Sep 03 '20 at 05:38
  • Yeah, seems controversial (with --tx you can provide processing too); I think a deep dive into the codebase can answer that question. I believe a worker is a dedicated machine, whereas -n is just a way to parallelize tests locally and is not considered a worker – Andriy Ivaneyko Sep 03 '20 at 08:55
  • Question: loadfile still starts several processes even if there is only one file. Do you know if there is a way to prevent it? It slows down single unittest development :) – Roelant May 04 '23 at 09:24
  • @Roelant I don't know how to address this; personally, I debug and develop with xdist disabled, targeting tests in the single file I am working with. – Andriy Ivaneyko May 04 '23 at 20:37

With pytest-xdist there is currently no kind of "per-file" or "per-test-suite" distribution. Actually, if a per-file distribution (i.e. tests in a file will only be executed by at most one worker at a time) would already help your use case, I encourage you to file a feature issue with the pytest issue tracker at https://bitbucket.org/hpk42/pytest/issues?status=new&status=open and link back to your good explanation here.

cheers, holger

hpk42
    For anyone viewing this post and wondering if any issues have been opened, the issues are: https://github.com/pytest-dev/pytest/issues/175 and https://github.com/pytest-dev/pytest/issues/738 – Chris Clark Oct 06 '15 at 15:55

pytest_mproc (https://nak.github.io/pytest_mproc_docs/index.html) provides a "group" decorator that allows you to group tests together for serial execution. It also has faster startup time than xdist when working on large numbers of cores, and provides a "global" scope test fixture.

nak

With test suites laid out in directories like in the question, you can run them in parallel via:

pytest -n=$(ls **/test*py | wc -l) --dist=loadfile

If your test suite files are all in a single directory, then just:

pytest -n=$(ls test*py | wc -l) --dist=loadfile

If a new suite file appears, this will include the new test file automatically and add an additional worker for it.
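If shell globbing isn't available (e.g. on Windows), the same worker count can be computed from Python; the `suite_count` helper below is hypothetical, mirroring the `ls **/test*py | wc -l` idiom above:

```python
import glob

def suite_count(pattern='**/test*py'):
    # Count the suite files matching the pattern, like `ls **/test*py | wc -l`.
    # recursive=True makes '**' match nested directories as well.
    return len(glob.glob(pattern, recursive=True))

# The count can then be passed to pytest, e.g.:
#   subprocess.call(['pytest', '-n', str(suite_count()), '--dist=loadfile'])
```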

German Petrov

It may also be worth noting that this issue on the pytest-xdist github addresses the use case I think you're asking about here.

Since the issue hasn't yet been resolved (despite being opened in 2015), I just monkey-patched a workable solution for my specific case. In my conftest.py, I added this chunk of code:

import xdist.scheduler as xdist_scheduler
def _split_scope(self, nodeid):
    return nodeid.rsplit("[", 1)[-1]
xdist_scheduler.loadscope.LoadScopeScheduling._split_scope = _split_scope

It basically just overwrites the _split_scope function from xdist so that groups are split on a different part of the node ID instead of by file. It worked for me, but I can't guarantee robustness since we're monkey-patching normally-internal code; use at your own risk.

John Aaron