I am trying to use Pytest to test a largish (~100k LOC, 1k files) project and there are several other similar projects for which I'd like to do the same, eventually. This is not a standard Python package; it's part of a heavily customized system that I have little power to change, at least in the near term. Test modules are integrated with the code, rather than being in a separate directory, and that is important to us. The configuration is fairly similar to this question, and my answer there might provide helpful background, too.

The issue I'm having is that the projects use PEP 420 implicit namespace packages almost exclusively; that is, there are almost no __init__.py files in any of the package directories. I haven't seen any cases yet where the packages had to be namespace packages, but given that this project is combined with other projects that also have Python code, this could happen (or already be happening and I've just not noticed it).

Consider a repository that looks like the following. (For a runnable copy of it, including the tests described below, clone 0cjs/pytest-impl-ns-pkg from GitHub.) All tests below are assumed to be in project/util/thing_test.py.

repo/
    project/
        util/
            thing.py
            thing_test.py

I have enough control over the testing configurations that I can ensure sys.path is set appropriately for imports of the code under test to work properly. That is, the following test will pass:

def test_good_import():
    import project.util.thing

However, Pytest is determining package names from files using its usual system, giving package names that are not the standard ones for my configuration and adding subdirectories of my project to sys.path. So the following two tests fail:

def test_modulename():
    assert 'project.util.thing_test' == __name__
    # Result: AssertionError: assert 'project.util.thing_test' == 'thing_test'

from pytest import raises

def test_bad_import():
    ''' While we have a `project.util.thing` deep in our hierarchy, we do
        not have a top-level `thing` module, so this import should fail.
    '''
    with raises(ImportError):
        import thing
    # Result: Failed: DID NOT RAISE <class 'ImportError'>

As you can see, while thing.py can always be imported as project.util.thing, thing_test.py is project.util.thing_test outside of Pytest, but in a Pytest run project/util is added to sys.path and the module is named thing_test.
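The naming behaviour described above can be reproduced without Pytest. The following is only a simplified model of the documented default ("prepend") import mode, not Pytest's actual code: it climbs from the test file while directories contain an `__init__.py` and treats the first directory without one as the `sys.path` entry. The function name `pytest_style_module_name` is hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def pytest_style_module_name(test_file: Path) -> Tuple[Path, str]:
    """Return (sys.path entry, module name) under the simplified model."""
    parts = [test_file.stem]
    directory = test_file.parent
    # Climb while the directory is a regular package (has __init__.py).
    while (directory / "__init__.py").is_file():
        parts.append(directory.name)
        directory = directory.parent
    return directory, ".".join(reversed(parts))


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "repo"
    util = repo / "project" / "util"
    util.mkdir(parents=True)
    test_file = util / "thing_test.py"
    test_file.write_text("")

    # No __init__.py anywhere: the test file's own directory is the path entry.
    path_entry, module_name = pytest_style_module_name(test_file)
    no_init_result = (path_entry.relative_to(repo).as_posix(), module_name)

    # With __init__.py in both package directories, the full name is recovered.
    (repo / "project" / "__init__.py").write_text("")
    (util / "__init__.py").write_text("")
    path_entry, module_name = pytest_style_module_name(test_file)
    with_init_result = (path_entry.name, module_name)

print(no_init_result)    # ('project/util', 'thing_test')
print(with_init_result)  # ('repo', 'project.util.thing_test')
```

In the namespace-package layout of this question, the first case is the one that applies, which is why `__name__` ends up as `thing_test`.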

This introduces a number of problems:

  1. Module namespace collisions (e.g., between project/util/thing_test.py and project/otherstuff/thing_test.py).
  2. Bad import statements not being caught because the code under test is also using these non-production import paths.
  3. Relative imports may not work in test code because the module has been "moved" in the hierarchy.
  4. In general I'm quite nervous about having a large number of extra paths added to sys.path in testing that will be absent in production as I see a lot of potential for errors in this. But let's call that the first (and at the moment, I guess, default) option.
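To make the first problem concrete, here is a small sketch that mimics only the `sys.path` effect: two directories that each hold a `thing_test.py` are put on `sys.path`, and the bare name `thing_test` can then refer to only one of them. How Pytest itself reacts to such a clash is not modelled here; the `WHERE` variable is just a marker invented for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    dirs = []
    for sub in ("util", "otherstuff"):
        d = Path(tmp) / "project" / sub
        d.mkdir(parents=True)
        # Each file records which directory it lives in.
        (d / "thing_test.py").write_text(f"WHERE = {sub!r}\n")
        dirs.append(str(d))

    saved_path = list(sys.path)
    try:
        # Mimic both test directories being prepended to sys.path.
        sys.path[:0] = dirs
        sys.modules.pop("thing_test", None)
        importlib.invalidate_caches()
        first = importlib.import_module("thing_test")
        again = importlib.import_module("thing_test")
        loaded_from = first.WHERE
        same_object = first is again
    finally:
        # Restore interpreter state so the demo leaves no trace.
        sys.path[:] = saved_path
        sys.modules.pop("thing_test", None)

print(loaded_from)  # util
print(same_object)  # True
```

The copy in `project/otherstuff` is unreachable under the name `thing_test`, whereas `project.util.thing_test` and `project.otherstuff.thing_test` would be distinct.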

What I think I would like to be able to do would be to tell Pytest that it should determine module names relative to specific filesystem paths that I provide, rather than itself deciding what paths to use based on presence and absence of __init__.py files. However, I see no way to do this with Pytest. (It's not out of the question for me to add this to Pytest, but that also won't happen in the near future as I think I'd want a much deeper understanding of Pytest before even proposing exactly how to do this.)
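As a rough illustration of the rule I have in mind (not an existing Pytest feature), a module name could be derived from an explicit list of root directories like this. The function `module_name_from_roots` and the example roots are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Iterable, Optional


def module_name_from_roots(
    path: PurePosixPath, roots: Iterable[PurePosixPath]
) -> Optional[str]:
    """Return the dotted name of `path` relative to the first root containing it."""
    for root in roots:
        try:
            relative = path.relative_to(root)
        except ValueError:
            continue  # path is not under this root
        # Drop the ".py" suffix and join the remaining components with dots.
        return ".".join(relative.with_suffix("").parts)
    return None


roots = [PurePosixPath("/somewhere/else"), PurePosixPath("/repo")]
name = module_name_from_roots(
    PurePosixPath("/repo/project/util/thing_test.py"), roots
)
outside = module_name_from_roots(PurePosixPath("/tmp/x_test.py"), roots)

print(name)     # project.util.thing_test
print(outside)  # None
```

With such a rule the presence or absence of `__init__.py` files would play no part in the result.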

A third option (after just living with the current situation and changing pytest as above) is simply to add dozens of __init__.py files to the project. However, while using extend_path in them would (I think) deal with the namespace vs. regular package issue in the normal Python world, I think it would break our unusual release system for packages declared in multiple projects. (That is, if another project had a project.util.other module and was combined for release with our project, the collision between their project/util/__init__.py and our project/util/__init__.py would be a major problem.) Fixing this would be a major challenge since we'd have to, among other things, add some way to declare that some directories containing an __init__.py are actually namespace packages.
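For reference, this is a sketch of what I mean by using `extend_path` (from the standard library's `pkgutil`) in plain Python, outside our release system. Two separate roots each provide a `util/__init__.py` with the same two lines, and modules from both roots are importable. The names `nsdemo_project`, `ours` and `theirs` are made up for the demonstration, and the top-level package is left as an implicit namespace package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Contents of every util/__init__.py: a pkgutil-style namespace package.
INIT_SOURCE = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

PKG = "nsdemo_project"  # hypothetical name, chosen to avoid clashes

with tempfile.TemporaryDirectory() as tmp:
    roots = []
    for root_name, module in (("ours", "thing"), ("theirs", "other")):
        util = Path(tmp) / root_name / PKG / "util"
        util.mkdir(parents=True)
        (util / "__init__.py").write_text(INIT_SOURCE)
        (util / f"{module}.py").write_text(f"ORIGIN = {root_name!r}\n")
        roots.append(str(Path(tmp) / root_name))

    saved_path = list(sys.path)
    try:
        sys.path[:0] = roots
        importlib.invalidate_caches()
        # Both submodules resolve although they live under different roots.
        thing = importlib.import_module(f"{PKG}.util.thing")
        other = importlib.import_module(f"{PKG}.util.other")
        origins = (thing.ORIGIN, other.ORIGIN)
    finally:
        # Restore interpreter state.
        sys.path[:] = saved_path
        for name in [n for n in sys.modules if n == PKG or n.startswith(PKG + ".")]:
            del sys.modules[name]

print(origins)  # ('ours', 'theirs')
```

This only works because the two `__init__.py` files stay in separate directories; when the roots are merged into one tree for release, the two files would land on the same path.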

Are there ways to improve the above options? Are there other options I'm missing?

cjs

2 Answers

The issue you are facing is that you place tests alongside the production code inside namespace packages. As stated here, pytest recognizes your setup as standalone test modules:

Standalone test modules / conftest.py files

...

pytest will find foo/bar/tests/test_foo.py and realize it is NOT part of a package given that there’s no __init__.py file in the same folder. It will then add root/foo/bar/tests to sys.path in order to import test_foo.py as the module test_foo. The same is done with the conftest.py file by adding root/foo to sys.path to import it as conftest.

So the proper way to solve (at least part of) this would be to adjust the sys.path and separate tests from production code, e.g. moving test module thing_test.py into a separate directory project/util/tests. Since you can't do that, you have no choice but to mess with pytest's internals (as you won't be able to override the module import behaviour via hooks). Here's a proposal: create a repo/conftest.py with the patched LocalPath class:

# repo/conftest.py

import pathlib
import py._path.local


# the original pypkgpath method can't deal with namespace packages,
# considering only dirs with __init__.py as packages
pypkgpath_orig = py._path.local.LocalPath.pypkgpath

# we consider all dirs in repo/ to be namespace packages
rootdir = pathlib.Path(__file__).parent.resolve()
namespace_pkg_dirs = [str(d) for d in rootdir.iterdir() if d.is_dir()]

# patched method
def pypkgpath(self):
    # call original lookup
    pkgpath = pypkgpath_orig(self)
    if pkgpath is not None:
        return pkgpath
    # original lookup failed, check if we are subdir of a namespace package
    # if yes, return the namespace package we belong to
    for parent in self.parts(reverse=True):
        if str(parent) in namespace_pkg_dirs:
            return parent
    return None

# apply patch
py._path.local.LocalPath.pypkgpath = pypkgpath

pytest>=6.0

Version 6 removes usages of py.path so the monkeypatching should be applied to _pytest.pathlib.resolve_package_path instead of LocalPath.pypkgpath, but the rest is essentially the same:

# repo/conftest.py

import pathlib
import _pytest.pathlib


resolve_pkg_path_orig = _pytest.pathlib.resolve_package_path

# we consider all dirs in repo/ to be namespace packages
rootdir = pathlib.Path(__file__).parent.resolve()
namespace_pkg_dirs = [str(d) for d in rootdir.iterdir() if d.is_dir()]

# patched method
def resolve_package_path(path):
    # call original lookup
    result = resolve_pkg_path_orig(path)
    if result is not None:
        return result
    # original lookup failed, check if we are subdir of a namespace package
    # if yes, return the namespace package we belong to
    for parent in path.parents:
        if str(parent) in namespace_pkg_dirs:
            return parent
    return None

# apply patch
_pytest.pathlib.resolve_package_path = resolve_package_path

hoefling
  • This looks good. It will take me a bit of time to get to testing it, but I'll comment further when I've had a chance to do so. – cjs May 07 '18 at 04:24
  • 1
    I just noticed an issue with the pytest docs (and perhaps how they are thinking about this). "pytest will find foo/bar/tests/test_foo.py and realize it is NOT part of a package" is incorrect; `test_foo.py` is indeed part of a package; it's just a namespace package rather than a regular package. A great part of the issue here is that exactly what packages and sub-packages any particular module is a part of is determined by the configuration of the Python interpreter, not by where the files are on disk. – cjs Apr 19 '19 at 04:58
  • 1
    pytest 6.0.0 breaks this monkeypatch in https://github.com/pytest-dev/pytest/commit/ab6dacf1d1e1ff0c5be70a3c5f48e63168168721, since they moved away from `py.path`. However the impl was mostly copy-pasted, so monkeypatching `_pytest.pathlib.resolve_package_path` with basically this same impl on >=6.0.0, except that you get passed a `pathlib.Path` instead of a py.path LocalPath, so you have to use `.parents()` instead of `.parts()`. – nitzmahone Jul 29 '20 at 01:28
  • 1
    @nitzmahone thanks for the hint, updated the answer with a snippet updated for `pytest>=6`. – hoefling Jul 29 '20 at 07:09
For pytest >= 6.0.0

The solution from hoefling doesn't work for me anymore. pytest's original resolve_package_path function does find the __init__.py file in project/util. As a result, hoefling's patched function returns early at the clause

if result is not None:
    return result  # value is "project/util"

The following modification works for me though:

from typing import Optional
import pathlib
import _pytest.pathlib

resolve_pkg_path_orig = _pytest.pathlib.resolve_package_path

# we consider all dirs in repo/ to be namespace packages
root_dir = pathlib.Path(__file__).parent.resolve()
namespace_pkg_dirs = [str(d) for d in root_dir.iterdir() if d.is_dir()]


# patched method
def resolve_package_path(path: pathlib.Path) -> Optional[pathlib.Path]:
    # call original lookup
    result = resolve_pkg_path_orig(path)
    if result is None:
        result = path  # let's search from the current directory upwards
    for parent in result.parents:
        if str(parent) in namespace_pkg_dirs:
            return parent
    return None

# apply patch
_pytest.pathlib.resolve_package_path = resolve_package_path

PS: I added typing and refactored variables a bit.
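To see why the early return matters, the two fallback strategies can be compared without pytest. This is a simplified sketch: `regular_package_root` only approximates what pytest's `resolve_package_path` does (topmost consecutive directory containing an `__init__.py`), and all three function names are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def regular_package_root(path: Path) -> Optional[Path]:
    """Approximation of pytest's lookup: topmost dir in an __init__.py chain."""
    result = None
    for directory in path.parents:
        if not (directory / "__init__.py").is_file():
            break
        result = directory
    return result


def fallback_only_if_none(path: Path, ns_dirs: List[str]) -> Optional[Path]:
    """Use the namespace lookup only when no regular package was found."""
    result = regular_package_root(path)
    if result is not None:
        return result
    for parent in path.parents:
        if str(parent) in ns_dirs:
            return parent
    return None


def always_search_upwards(path: Path, ns_dirs: List[str]) -> Optional[Path]:
    """Keep searching above whatever the regular lookup returned."""
    result = regular_package_root(path) or path
    for parent in result.parents:
        if str(parent) in ns_dirs:
            return parent
    return None


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "repo"
    util = repo / "project" / "util"
    util.mkdir(parents=True)
    (util / "__init__.py").write_text("")  # util is a regular package here
    test_file = util / "thing_test.py"
    test_file.write_text("")

    ns_dirs = [str(d) for d in repo.iterdir() if d.is_dir()]
    early = fallback_only_if_none(test_file, ns_dirs).name
    upwards = always_search_upwards(test_file, ns_dirs).name

print(early)    # util
print(upwards)  # project
```

With `util` as the package root the test module would be named `util.thing_test`; with `project` it becomes `project.util.thing_test`.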

patzm
  • 1
    This worked for me, while @hoefling's did not. Thank you, I've been banging my head against this for a full day. – Paul Price Sep 15 '22 at 17:37
  • So basically this is my answer copied with one line changed. This could easily be a comment instead, as I obviously do not do a daily check for all my answers being up to date. – hoefling Jan 13 '23 at 16:14
  • Also downvoted since my impl was not meant for regular packages, only for PEP 420 namespace ones (OP's use case - note there are _no `__init__.py` files_ in the question). This copied answer only "solves" a very specific use case where the root package is namespaced, whereas the direct child is not, nothing more. Add a `project/__init__.py` to the equation and this "solution" will also "not work" as expected. – hoefling Jan 13 '23 at 16:39
  • @hoefling I think the edit queue for your suggestion was full already. If my suggestion is a more general fix, feel free to copy it over into yours and I can delete mine. After all, we are here to give quick solutions, not many ;-) – patzm Jan 21 '23 at 15:02
  • also technically speaking, adding a `project/__init__.py` would trigger many many more issues. One doesn't do that for namespace packages. – patzm Jan 21 '23 at 15:04