3

I have a question in how to properly create a path in Python (Python 3.x).

I developed a small scraping app in Python with the following directory structure.

root
├── Dockerfile
├── README.md
├── tox.ini 
├── src
│   └── myapp
│       ├── __init__.py
│       ├── do_something.py
│       └── do_something_else.py
└── tests
    ├── __init__.py
    ├──  test_do_something.py
    └──  test_do_something_else.py

When I want to run my code, I can go to the src directory and do with python do_something.py But, because do_something.py has an import statement from do_something_else.py, it fails like:

Traceback (most recent call last):
  File "src/myapp/do_something.py", line 1, in <module>
    from src.myapp.do_something_else import do_it
ModuleNotFoundError: No module named 'src'

So, I eventually decided to use the following command to specify the python path: PYTHONPATH=../../ python do_something.py to make sure that the path is seen.

But, what are the better ways to feed the path so that my app can run? I want to know this because when I run pytest via tox, the directory that I would run the command tox would be at the root so that tox.ini is seen by tox package. If I do that, then I most likely run into a similar problem due to the Python path not properly set.

Questions I want to ask specifically are:

  1. where should I run my main code when creating my own project like this? root as like python src/myapp/do_something.py? Or, go to the src/myapp directory and run like python do_something.py?
  2. once, the directory where I should execute my program is determined, what is the correct way to import modules from other py file? Is it ok to use from src.myapp.do_something_else import do_it (this means I must add path from src directory)? Or, different way to import?
  3. What are ways I can have my Python recognize the path? I am aware there are several ways to make the pass accessible as below:

    a. write export PYTHONPATH=<path_of_my_choice>:$PYTHONPATH to make the path accessible temporarily, or write that line in my .bashrc to make it permanent (but it's hard to reproduce when I want to automate creating Python environment via ansible or other automation tools)

    b. write import sys; sys.path.append(<root>) to have the root as an accessible path

    c. use pytest-pythonpath package (but this is not really a generic answer)

Thank you so much for your inputs!

my environment

OS: MacOS and Amazon Linux 2
Python Version: 3.7
Dependency in Python: pytest, tox
masaaa015
  • 164
  • 1
  • 1
  • 10

2 Answers2

1

I would suggest to use setup.py to make this a python package. Then you can install it in development mode python setup.py develop. This way it will be available in your python environment w/o needing to specify the PYTHONPATH.

For testing, you can simply install the package python setup.py install.

Hope that helps.

bjonen
  • 1,503
  • 16
  • 24
  • Thank you so much for your input! I am still in search of the right way, and I am still debating if I should make such a small application into a package or not, but I understand that that's one way. Would you or your team create a package that easily? Anyway, thank you for your answer! – masaaa015 Feb 18 '20 at 04:22
  • 1
    @user3660294 Yes, make it an actual installable project. I just saw that you updated the question and mention that you use _tox_. So yes, definitely make it an installable project: add a _setuptools_ `setup.py` (or similar build framework). Even a minimal `setup.py` would be enough if you don't plan on distributing this project. It just needs to be good enough to build a _sdist_ and use the _develop_ or _editable_ mode (`python setup.py develop` or `pip install -e .`). Save yourself from the silliness of modifying `sys.path`, `PYTHONPATH` or adding unnecessary `__init__.py` files. – sinoroc Feb 18 '20 at 12:21
  • @sinoroc thank you again for your input. I think I shall try to make it into a package then! – masaaa015 Feb 19 '20 at 04:40
-1

Two simple steps should make it happen. Python experts can comment if this is a good way to do it (especially going by the concluding caution raised towards the end of this post).

I would have done it like below.

First I would have put a "__init__.py" in root so that hierarchy looks like below. This way python will treat the folder as a package.

root
├── Dockerfile
├── README.md
├── tox.ini 
├── __init__.py
├── src
│   └── myapp
│       ├── __init__.py
│       ├── do_something.py
│       └── do_something_else.py
└── tests
    ├── __init__.py
    ├──  test_do_something.py
    └──  test_do_something_else.py

Then in "do_something.py", I would have added these lines at the top. In the second line please put the full path to the "root" directory.

import sys
sys.path += ['/home/SomeUserName/SomeFolderPath/root']
from  src.myapp.do_something_else import do_it

Please note that the second line will essentially modify the sys.path by adding the root folder path (I guess until the interpreter quits). If this is not what you can afford then I am sorry.

Amit
  • 2,018
  • 1
  • 8
  • 12
  • There is almost never a good reason to modify `sys.path` or `PYTHONPATH`. This is ill advised. Most of the issues when people resort to modifying these, can be solved in a much cleaner way by packaging the project properly. If packaging is not possible then, make sure the current working directory is the one containing the _top-level import packages_ and use `python -m ...` accordingly. – sinoroc Feb 17 '20 at 09:54
  • @sinoroc Thank you for the alternate solution but as far as modifying the sys.path is concerned I suggest to have a look at the official docs (https://docs.python.org/3.6/library/sys.html). It says the following about sys.path list. "A program is free to modify this list for its own purposes. Only strings and bytes should be added to sys.path; all other data types are ignored during import." You must be having, however, sufficient experience on this and if you could point to a reference or discussion, I will be thankful. It will help me (and others too) in future. Many thanks. – Amit Feb 17 '20 at 10:09
  • 1
    In short: instead of modifying `sys.path`, make it so that the code you want to import is in one of the locations already listed in `sys.path`. Two are always there: the _site-packages_ directory, and the current working directory. If packaging is done right and project is installed correctly, then its top-level packages are importable from _site-packages_ directory. In some cases packaging is not possible or _overkill_, then `python -m` allows importing packages from the current working directory. Try it yourself. No reference, but some info: https://stackoverflow.com/q/22241420/11138259 – sinoroc Feb 17 '20 at 11:04
  • Thank you @Amit for interesting conversation. I have never seen __init__.py in the root directory and it is a way to hack the Python import system, it seems. Thank you for an interesting suggestion. – masaaa015 Feb 18 '20 at 04:49
  • @sinoroc, thank you for the comment. So you would suggest creating a package just like bjonen suggested above, right? And, would you not add sys.path.append somewhere to add a path? – masaaa015 Feb 18 '20 at 04:49
  • @sinoroc, wow. I didn't know the mechanism of -m option. (I read the link that you wrote) This could be an alternative, and I could run my script using m option, like `PYTHONPATH=root python -m src.myapp`. The only reason I hesitate to not make the app an package is it could be an overkill like you mention. – masaaa015 Feb 18 '20 at 05:17
  • @user3660294 Thank you. Since you are using Python 3.7, I am sharing modules official docs (https://docs.python.org/3.7/tutorial/modules.html). Please go to section 6.4 where you will see __init__.py in the top sound directory. Entire page is a good read but I want to draw your attention to at least this line. "The __init__.py files are required to make Python treat directories containing the file as packages." Hope it helps. Discussion on SO is here as well (https://stackoverflow.com/questions/448271/what-is-init-py-for) – Amit Feb 18 '20 at 06:40
  • 1
    Making it an actual installable project is really not that much more effort, and I would consider it the best practice. Your structure is already in place, you really only need to add a minimal `setup.py` file. It really can be very minimal since you presumably don't want to distribute/publish it. Adding loose `__init__.py` files, or modifying `sys.path` and `PYTHONPATH` really should be just temporary fixes. If you can, then experiment with all these techniques. – sinoroc Feb 18 '20 at 11:22