
I want to create a git repo that can be used like this:

git clone $PROJECT_URL my_project
cd my_project
python some_dir/some_script.py

And I want some_dir/some_script.py to import from another_dir/some_module.py.

How can I accomplish this?

Some desired requirements, in order of decreasing importance to me:

  1. No sys.path modifications from within any of the .py files. This leads to fragility when doing IDE-powered automated refactoring.

  2. No directory structure changes. The repo has been thoughtfully structured.

  3. No changes to my environment. I don't want to add a hard-coded path to my $PYTHONPATH for instance, as that can result in unexpected behavior when I cd to other directories and launch unrelated python commands.

  4. Minimal changes to the sequence of 3 commands above. I don't want a complicated workflow, I want to use tab-completion for some_dir/some_script.py, and I don't want to spend keystrokes on extra python cmdline flags.

I see four solutions to my general problem described here, but none of them meet all of the above requirements.

If no solution is possible, then why are things this way? This seems like such a natural want, and the requirements I list seem perfectly reasonable. I'm aware of a religious argument in a 2007 email from Guido:

I'm -1 on this and on any other proposed twiddlings of the __main__ machinery. The only use case seems to be running scripts that happen to be living inside a module's directory, which I've always seen as an antipattern. To make me change my mind you'd have to convince me that it isn't.

But I'm not sure whether things have changed since then.

dshin

1 Answer


Opinions haven't changed on this topic since Guido's 2007 comment. If anything, we're moving even further in the opposite direction, with the addition of the PYTHONSAFEPATH environment variable and the corresponding -P option in 3.11.

These options will nerf direct sibling module imports too, requiring sys.path to be explicitly configured even for scripts!
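To see which mode an interpreter is running in, a minimal sketch like the following can help. It assumes Python 3.11+ for `sys.flags.safe_path` and falls back to the older behaviour on earlier versions; the function name is made up for illustration.

```python
import sys


def script_dir_is_prepended() -> bool:
    """Return True if Python prepends the script's directory to sys.path."""
    # sys.flags.safe_path exists on Python 3.11+ and is set by -P or
    # PYTHONSAFEPATH. Older versions have no such flag and always prepend.
    safe_path = getattr(sys.flags, "safe_path", False)
    return not safe_path


if __name__ == "__main__":
    print("script dir auto-added to sys.path:", script_dir_is_prepended())
    print("first sys.path entry:", sys.path[:1])
```

Running it once as `python check.py` and once as `python -P check.py` (the file name is arbitrary) should show the difference.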

So, scripts still can't easily do relative imports, and executable scripts living within a package structure are still considered an anti-pattern. What to do instead?! The widely accepted alternative here is to use the packaging feature of entry-points. One type of entry-point group in packaging metadata is the "console_scripts" group, used to point to arbitrary callables defined within your package code. If you add entries in this group within your package metadata, then script wrappers for those callables will be auto-generated and put somewhere on $PATH at pip install time. No hacking of sys.path necessary.
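As a rough sketch of what such a callable could look like: the `main` function, its placement in another_dir/some_module.py, and the command name `some-script` are all assumptions made for illustration, not something from the question.

```python
# another_dir/some_module.py (hypothetical contents)
import sys


def main(argv=None) -> int:
    """Callable that a console_scripts entry could point at."""
    # The generated wrapper calls main() with no arguments, so fall back
    # to the real command line when argv is not supplied.
    args = sys.argv[1:] if argv is None else list(argv)
    print(f"some_module.main called with {args}")
    return 0  # the wrapper hands this return value to sys.exit()


# Assumed addition to pyproject.toml ("some-script" is a made-up name):
#
#   [project.scripts]
#   some-script = "another_dir.some_module:main"
```

With an entry like that in place, `pip install` would generate a `some-script` executable in the environment's bin directory, so the command works from any working directory while the environment is active.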

That being said, it's still possible to run .py files directly as scripts, provided you've configured the underlying Python environment for them to resolve their dependencies (imports) correctly. To do that, you'll want to define a package structure and "install" the package so that your source code is visible on sys.path.

Here's a minimal example:

my_project
├── another_dir
│   ├── __init__.py       <-- __init__ file required for package dirs (it can be empty)
│   └── some_module.py
├── pyproject.toml        <-- packaging metadata lives here
└── some_dir              <-- no __init__ file necessary for non-packaged subdirs
    └── some_script.py

Minimal contents of the packaging definition in pyproject.toml:

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[project]
name = "my_proj"
version = "0.1"

[tool.setuptools.packages.find]
namespaces = false

An additional once-off step is required to create/configure an environment in between the git clone and the script execution:

python3 -m venv .venv
source .venv/bin/activate
pip install -e .

This makes sure that another_dir is available to import from the environment's site-packages directory, which is already one of the locations on sys.path (check with python -m site). That's what's required for any/all of these import statements to work from within the script file(s):

from another_dir import some_module
import another_dir.some_module
from another_dir.some_module import something

Note that this does not necessarily put the parent of another_dir onto sys.path directly. For an editable install, it will set up some scaffolding which makes your package appear to be "installed" in the site, which is sufficient for those imports to succeed. For a non-editable install (pip install without the -e flag), it will just copy your package directly into the site, compile the .pyc files, and then the code will be found by the normal SourceFileLoader.
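If you're curious how a given install resolves, a small helper along these lines can report where the import system finds a package and which loader handles it. This is only a sketch; `describe_import` is a hypothetical name, not part of any library.

```python
import importlib.util


def describe_import(name: str) -> dict:
    """Report where the import system would find the top-level module `name`."""
    # For a top-level name, find_spec() locates the module without executing it
    # and returns None when nothing on sys.path provides it.
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"name": name, "found": False, "origin": None, "loader": None}
    return {
        "name": name,
        "found": True,
        "origin": spec.origin,  # path of the file that would be loaded
        "loader": type(spec.loader).__name__,
    }


if __name__ == "__main__":
    print(describe_import("json"))
```

Inside the activated environment, `describe_import("another_dir")` would be expected to report an origin under your checkout for an editable install, and under site-packages for a regular one.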

wim
  • Thanks, Wim! Question on anti-pattern notion: what about unit tests? It seems natural to place unit testing scripts within a package in a separate directory from the modules that they test. – dshin Nov 30 '22 at 22:33
  • @dshin Unit tests are usually functions, not scripts. In pytest, the test collection phase will go and collect all the functions named like `test_*` from within modules named like `test_*.py` and run them, perhaps in parallel, so you don't really execute the test modules directly. They'd usually be in a separate `tests` subdirectory, not located within the package (so that they are available in a development checkout, but not in a regular install) – wim Dec 01 '22 at 04:12
  • I think this answer is not quite complete. Python actually did add https://docs.python.org/3/library/importlib.html#importlib.abc.FileLoader which would allow OP to do just what he has asked for and fulfil all his requirements. I've wrapped this approach in an experimental library which gives you more control over your imports: https://github.com/ronny-rentner/ultraimport –  Dec 07 '22 at 05:39
  • @Ronny Presumably dshin doesn't want to change import statements in the code into function calls. But if you think such an approach is worth considering, then by all means post an answer - it would have much better visibility than as a comment on this answer. – wim Dec 07 '22 at 06:09
  • @wim Hmm, where did he say that he doesn't want to change the imports? I must have missed it. As he is already happy with your answer and has accepted it, I would only post another answer if he thinks it's worth it. –  Dec 07 '22 at 09:08
  • @Ronny That's why I wrote "presumably". Anyway, answers on SO are not just for the original poster, often they are useful to other visitors arriving in the search too. – wim Dec 07 '22 at 20:57