1

I have a hierarchy for a package like this:

test_script.py
package_name/
     __init__.py
     functionality_1.py
     functionality_2.py

For testing purposes, in addition to the functions in functionality_1.py, there is a section to run it as main. I use that for debugging and developing that functionality of the package. At the bottom of functionality_1.py is a standard main like this:

if __name__ == "__main__":
     # Do some stuff

I would like logging from the functions in functionality_1.py to use a logger named package_name.functionality_1, and functionality_2.py to use a logger named package_name.functionality_2.

I tried what I've seen in examples, using

logger = logging.getLogger(__name__)

but if I'm running file functionality_1.py with python -m package_name.functionality_1, the logger is always named __main__.

I'd rather not hardcode logger names, but I'm not sure what the best way to do this is.

Where do you create loggers, so each xxxx.py has its own logger? __init__.py does seem like the right place.

Is it bad form putting a main function in my package files?

bpeikes
  • The script you are calling originally is `__main__`; all packages and modules you are importing will get a name in dot notation. That's the point where `logger = logging.getLogger(__name__)` starts to make sense. – Klaus D. Jul 18 '23 at 20:59
  • 2
    Do note that `__main__` and `package_name.functionality_1` are going to be two different modules, even when you start Python as `python -m package_name.functionality_1`. You need to be careful with how you separate entrypoints from library code, or that can come back to bite you – Brian61354270 Jul 18 '23 at 21:02
  • 2
    Since you mention this is for a package, you should check whether your build system has support for entrypoint scripts. Most modern build systems do. Then you can bypass the `__main__` shenanigans and just let your build system generate a `__main__` for you that calls some chosen function, like `functionality_1.main` – Brian61354270 Jul 18 '23 at 21:13
  • Another route to consider is making the `functionality_x` modules subpackages with `__main__.py` files. Then just put something like `from package_name.functionality_x import main; main()` in the `__main__.py`s – Brian61354270 Jul 18 '23 at 21:16
  • Where/how are you configuring the logging system? You won't even see the logger name rendered unless you have a logging configuration call somewhere.. – wim Jul 18 '23 at 21:19
  • @wim - Configuration of the logging system would occur at the top of any __main__ function – bpeikes Jul 18 '23 at 22:18
  • @Brian61354270 - What do you mean by “be careful with how you separate entrypoints from library code, or that can come back to bite you” – bpeikes Jul 18 '23 at 22:23
  • @bpeikes If any part of your library code directly or indirectly imports the entrypoint module, you'll end up with two copies of the same module loaded. This can be problematic if the modules have any state (think global variables, @register-ing decorators, etc) or if they declare any types that later get inspected (think two different `class Foo:`s that fail `isinstance` checks with each other) – Brian61354270 Jul 18 '23 at 22:36
  • I agree with @Brian61354270 that you probably want to use entrypoints in the console_scripts group. Creating `__main__.py` functions which import main and call it is a way to make `python3 -m` work in a similar way, but it can be error-prone due to the necessity to carefully isolate the top-level code environment. – wim Jul 19 '23 at 03:30
  • You can use `Path(__file__).stem` from pathlib – sureshvv Jul 19 '23 at 03:31
  • 2
    @sureshvv **No you can not**. Please add your `Path(__file__)` suggestion as an answer instead of a comment, so that I can downvote that. It's crucial in the logging framework for the loggers to have a fully qualified _package name_, with periods, not a filename separated with slashes. – wim Jul 19 '23 at 03:32
  • @wim do you know pathlib? – sureshvv Jul 19 '23 at 03:36
  • 1
    @sureshvv Yes, been using it many years. I'm actually the [top pathlib answerer](https://stackoverflow.com/tags/pathlib/topusers) on Stack Overflow. – wim Jul 19 '23 at 03:44
  • Have a look at this [explanation on configuring package, module, class and instance loggers](https://stackoverflow.com/a/50751987/325452). – ingyhere Aug 06 '23 at 07:43
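
The point raised in the comments above, that `__main__` and `package_name.functionality_1` end up as two different modules, can be reproduced with a small sketch. Everything here is an assumption made for illustration: `demo_pkg` and `entry` are hypothetical names for a throwaway package written to a temporary directory, and `runpy.run_module(..., run_name="__main__")` is used as a stand-in for `python -m`, not as an exact replica of it.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical throwaway package: demo_pkg/entry.py defines a single class.
    pkg = Path(tmp) / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "entry.py").write_text("class Foo:\n    pass\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Roughly what `python -m demo_pkg.entry` does: execute the code as __main__.
        main_globals = runpy.run_module("demo_pkg.entry", run_name="__main__")
        # What library code does when it imports the same module by its real name.
        imported = importlib.import_module("demo_pkg.entry")
    finally:
        # Undo the changes made to interpreter state for this demonstration.
        sys.path.remove(tmp)
        sys.modules.pop("demo_pkg.entry", None)
        sys.modules.pop("demo_pkg", None)

# The same source file was executed twice, producing two unrelated classes.
same_class = main_globals["Foo"] is imported.Foo
cross_check = isinstance(main_globals["Foo"](), imported.Foo)
print(main_globals["__name__"], imported.__name__)
print(same_class, cross_check)
```

The two copies also see different values of `__name__`, which is why a logger created with `logging.getLogger(__name__)` is named `__main__` in one copy and gets the dotted module name in the other.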

2 Answers

1

Is it bad form putting a main function in my package files?

It is not necessarily bad for your submodules to have a main function defined, but if you want the logger names to be something other than "__main__" then you'll want to avoid executing the submodules as top-level code. Do not try to fight or fiddle with the __main__ machinery: this part of the language is inflexible, and it's unlikely to change because Guido considers executing submodules directly as scripts an anti-pattern.

Instead, you want the top-level code environment to be a wrapper script which imports main and executes it. This way, the normal pattern of logger = logging.getLogger(__name__) will work as desired, creating a proper hierarchy of named loggers for your subpackages. The logging tree hierarchy is important so that logging handlers can be configured recursively, and loggers can propagate up to a single node (i.e. the root logger).
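
As a minimal sketch of why the hierarchy matters, the snippet below configures only the `package_name` node and lets both child loggers inherit from it. The logger names are typed out literally here so the example runs standalone (in real modules they would come from `__name__`), and `ListHandler` is a hypothetical helper added purely so the result can be inspected.

```python
import logging

# Names mirror the package layout in the question; written out literally for the demo.
pkg_log = logging.getLogger("package_name")
f1_log = logging.getLogger("package_name.functionality_1")
f2_log = logging.getLogger("package_name.functionality_2")


class ListHandler(logging.Handler):
    """Hypothetical helper: collects formatted records in a list."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


# Configure the package node once; the children have no handlers or levels of their own.
handler = ListHandler()
handler.setFormatter(logging.Formatter("%(name)s:%(message)s"))
pkg_log.addHandler(handler)
pkg_log.setLevel(logging.INFO)

f1_log.info("hello from 1")   # propagates up to the package handler
f2_log.debug("filtered out")  # below the INFO level inherited from package_name
f2_log.info("hello from 2")

print(handler.lines)
```

Both child loggers pick up the level and the handler from `package_name`, which is only possible because their names share that dotted prefix.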

There is a Python packaging feature called Entry Points which can be used to generate these wrapper scripts automatically. I'll demonstrate how to define them with setuptools, but all other major build systems support them nowadays since they were specified years ago in PEP 621 – Storing project metadata in pyproject.toml.

In your submodules, you will have something like (simplified):

# package_name/functionality_1.py
import logging

log = logging.getLogger(__name__)

def main():
    logging.basicConfig(format="%(name)s:%(message)s", level=logging.INFO)
    log.info("hello")

And in your pyproject.toml file, add a section with:

[project.scripts]
functionality-1 = "package_name.functionality_1:main"
functionality-2 = "package_name.functionality_2:main"

The installer (usually pip) will autogenerate wrapper scripts and put them on the $PATH. These scripts are what you should invoke instead of executing submodules directly as __main__, i.e. instead of using:

python3 -m package_name.functionality_1
python3 -m package_name.functionality_2

You will call:

functionality-1
functionality-2

You can remove all the if __name__ == "__main__": blocks from the source code; they are not necessary. For development/testing, install your package as editable by using pip install -e . (it is this installation command which will generate the wrapper scripts).

Take a look at the contents of the wrapper scripts if you like, to see how it works. You'll find them at the location specified by sysconfig.get_path("scripts"). They'll have executable mode bits set, and if you're on macOS/Linux they'll have a #!shebang pointing at the corresponding Python environment (i.e. the runtime used for the installation). It is these scripts which execute as the top-level code, and they'll just import the main function(s) from your package code and call them.
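
A small sketch for locating and printing one of those wrappers is shown here. It assumes the package has been installed into the current environment with the `functionality-1` name from the `[project.scripts]` table above. On Windows the wrapper is typically an `.exe` launcher rather than a text file, so the lookup below would simply report that nothing was found.

```python
import sysconfig
from pathlib import Path

# Directory where the installer places console scripts for this environment.
scripts_dir = Path(sysconfig.get_path("scripts"))
print(scripts_dir)

# "functionality-1" is the script name declared under [project.scripts].
wrapper = scripts_dir / "functionality-1"
if wrapper.is_file():
    # On macOS/Linux this is a short Python file that imports main and calls it.
    print(wrapper.read_text())
else:
    print(f"{wrapper.name} not found; is the package installed in this environment?")
```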

wim
-1

You can try:

from pathlib import Path
import logging
# ...
logger = logging.getLogger(Path(__file__).stem)

You can use the inspect module to get the package name also (see Get full package module name) but I suspect that you will be happy with just the module name.

Brian61354270
sureshvv
  • 1
    I downvoted because the loggers are named incorrectly. This drops the package name. The logger which should be named `"package_name.functionality_1"` will be named just `"functionality_1"` using this method. – wim Jul 19 '23 at 03:41
  • 2
    The reason that the fully qualified name is important, is that it should be possible to configure the handlers of both `package_name.functionality_1` and `package_name.functionality_2` by configuring the `package_name` node of the logging tree. If you don't have fully qualified package names then you don't have a configurable logging tree at all, you just have "flat" bunch of loggers all pointing directly at the root logger. Also, any submodules with the same filename will use the same logger, rather than being namespaced, which means they can't be configured or even identified separately. – wim Jul 19 '23 at 03:56
  • This also suffers from the same problems as the other (now deleted) answer that used `__file__` did. You'll need to carefully process `__file__` or you'll end up with surprising logger names sometimes, like `M.cpython-XYZ.opt-N` – Brian61354270 Jul 19 '23 at 13:21
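
The objections in the comments above can be checked with a short sketch. The file paths are hypothetical examples (including the `cpython-311.opt-1` cache tag), and `PurePosixPath` is used only so the result does not depend on the operating system.

```python
import logging
from pathlib import PurePosixPath

# Hypothetical paths standing in for possible values of __file__.
source = PurePosixPath("package_name/functionality_1.py")
cached = PurePosixPath("package_name/__pycache__/functionality_1.cpython-311.opt-1.pyc")

# .stem strips only the final suffix, and never includes the package directory.
print(source.stem)
print(cached.stem)

# Two same-named files in different packages collapse onto one logger object.
log_a = logging.getLogger(PurePosixPath("pkg_a/utils.py").stem)
log_b = logging.getLogger(PurePosixPath("pkg_b/utils.py").stem)
print(log_a is log_b)
```

The first stem loses the `package_name.` prefix, the second keeps the interpreter tag as part of the name, and the last line shows two unrelated modules sharing a single logger.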