1

I would like a way to detect if my module was executed directly, as in import module or from module import * rather than by import module.submodule (which also executes module), and have this information accessible in module's __init__.py.

Here is a use case:

In Python, a common idiom is to add import statements to a module's __init__.py file, so as to "flatten" the module's namespace and make the contents of its submodules accessible directly. Unfortunately, doing so can make loading a specific submodule very slow, as all other siblings imported in __init__.py will also execute.

For instance:

module/
   __init__.py
   submodule/
      __init__.py
      ...
   sibling/
      __init__.py
      ...

By adding to module/__init__.py:

from .submodule import *
from .sibling import *

It is now possible for users of the module to access definitions in submodules without knowing the details of the package structure (i.e. from module import SomeClass, where SomeClass is defined somewhere in submodule and exposed in its own __init__.py file).

However, if I now run submodule directly (as in import module.submodule, by calling python3 -m module.submodule, or even indirectly via pytest) I will also, unavoidably, execute sibling! If sibling is large, this can slow things down for no reason.

I would instead like to write module/__init__.py something like:

if __???__ == 'module':
   from .submodule import *
   from .sibling import *

Where __???__ gives me the fully qualified name of the import. Any similar mechanism would also work, although I'm mostly interested in the general case (detecting direct execution) rather than this specific example.

yawn
  • Refer to [What does `if __name__ == "__main__":` do?](https://stackoverflow.com/questions/419163/what-does-if-name-main-do) - you may be looking for `if __name__ != '__main__':`, which has the opposite effect: the guarded code runs only when the file is imported, not when it is executed directly. – metatoaster Nov 24 '20 at 06:13
  • Yes, this is quite similar to what I would need but it doesn't quite work (from what I understand). When importing a module using `import module`, the `__name__` is set to `module` which is indistinguishable from when doing `import module.submodule`, which also sets it to `module`. – yawn Nov 24 '20 at 08:30
  • This is simply because `import module.submodule` implicitly does an `import module` first. That is also why, after executing `import module.submodule`, the name `module` is bound to the parent package of `module.submodule`. – metatoaster Nov 24 '20 at 08:56
  • That's what I thought, but I don't see where to go from here. Is there any way of accessing the whole `module.submodule` string somewhere? I've looked into all the "dunder" variables and nothing seems to help. – yawn Nov 24 '20 at 19:27
  • There is essentially no way to stop execution of anything inside `module/__init__.py`, nor will any special indication be presented to it upon execution of `import module.submodule` from anywhere. – metatoaster Nov 24 '20 at 23:22

2 Answers

2

What is being asked for would, if it were possible, result in undefined behavior (in the sense that whether the flattened names are importable from module would be unpredictable) once we consider how the import system actually works.

Hypothetically, suppose what you want were possible: some __dunder__ would disambiguate which import statement was used to import module/__init__.py (e.g. import module and from module import *, vs. import module.submodule). In the first case, module would trigger the subsequent (slow) imports to produce a "flattened" version of the desired names, while in the latter case (import module.submodule) it would skip them, and thus module would not contain any of the "flattened" names.

To illustrate further, say one can import SiblingClass, defined in module.sibling, by simply doing from module import SiblingClass, because module/__init__.py executes the from .sibling import * statement that creates that binding. But if executing import module.submodule skipped that flattening import, we would get the following scenario:

import module.submodule
# module.submodule gets imported
from module import SiblingClass
# ImportError will occur

Why is that? This is simply due to how Python imports a file: the source file is executed in its entirety once, binding imports, functions and classes to their names, and the resulting module is registered in sys.modules under its import name. Importing the module again does not execute the file again. Thus, if the from .sibling import * statement was not executed during the initial import (i.e. import module.submodule), it will not be executed during any subsequent import of the same module, because the module object stored in sys.modules by the initial import is returned instead. The only exception is a manual reload of the module, which does execute its code again.
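The failure mode described above can be simulated without the nonexistent dunder. The sketch below is only an illustration: the package name flat_pkg and the FLAT_PKG_FLATTEN environment variable are made up, with the variable standing in for the hypothetical "how was I imported" signal.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "flat_pkg"
(pkg / "submodule").mkdir(parents=True)
(pkg / "sibling").mkdir()
# FLAT_PKG_FLATTEN stands in for the hypothetical dunder from the question.
(pkg / "__init__.py").write_text(
    "import os\n"
    "if os.environ.get('FLAT_PKG_FLATTEN') == '1':\n"
    "    from .sibling import *\n"
)
(pkg / "submodule" / "__init__.py").write_text("")
(pkg / "sibling" / "__init__.py").write_text("class SiblingClass:\n    pass\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# First import: pretend the package detected "import flat_pkg.submodule"
# and therefore skipped the flattening import.
os.environ["FLAT_PKG_FLATTEN"] = "0"
import flat_pkg.submodule

# Later import: even with the switch flipped, flat_pkg/__init__.py is not
# executed again, so the flattened name was never bound.
os.environ["FLAT_PKG_FLATTEN"] = "1"
try:
    from flat_pkg import SiblingClass
except ImportError as exc:
    outcome = f"ImportError: {exc}"
else:
    outcome = "imported"

print(outcome)
```

Whether the flattened name is available would therefore depend on which import statement happened to run first anywhere in the process, which is the unpredictability this answer refers to.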

You may verify this by putting a print statement into a file, importing the corresponding module to see the output, and observing that no further output is produced on subsequent imports of that module (related: What happens when a module is imported twice?).
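A minimal way to run that check, assuming a throwaway package named demo_pkg (a hypothetical name) whose __init__.py files only print their own __name__:

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

# Create demo_pkg/ and demo_pkg/submodule/, each printing its __name__ on execution.
root = Path(tempfile.mkdtemp())
sub = root / "demo_pkg" / "submodule"
sub.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text('print("executing", __name__)\n')
(sub / "__init__.py").write_text('print("executing", __name__)\n')
sys.path.insert(0, str(root))
importlib.invalidate_caches()

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    import demo_pkg.submodule   # runs demo_pkg/__init__.py, then the submodule
    import demo_pkg             # found in sys.modules: nothing is executed
    importlib.reload(demo_pkg)  # an explicit reload runs the file again

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```

The captured output contains one line for the parent package, one for the submodule, and a third line only because of the explicit reload. Note that the parent reports its __name__ as demo_pkg even though it was reached through import demo_pkg.submodule, which is why __name__ cannot distinguish the two cases.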

Effectively, the desired functionality as described in the question cannot be implemented in Python.

A related thread on this topic: How to only import sub module without exec __init__.py in the package

metatoaster
  • 1
    This makes perfect sense, thanks for your detailed answer. For this specific problem I have settled on creating an extra file named something like `module.all` and doing `from module.all import *` whenever this type of functionality is needed. – yawn Nov 25 '20 at 02:26
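The workaround mentioned in this comment might look roughly like the sketch below. The package name lean_pkg and the class names are hypothetical; the only idea taken from the comment is that the star imports move out of __init__.py into a separate all.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package whose __init__.py does no eager imports.
root = Path(tempfile.mkdtemp())
pkg = root / "lean_pkg"
for name in ("submodule", "sibling"):
    (pkg / name).mkdir(parents=True)
(pkg / "__init__.py").write_text("")  # deliberately empty
# The flattening imports live in a separate, opt-in module instead.
(pkg / "all.py").write_text("from .submodule import *\nfrom .sibling import *\n")
(pkg / "submodule" / "__init__.py").write_text("class SomeClass:\n    pass\n")
(pkg / "sibling" / "__init__.py").write_text("class SiblingClass:\n    pass\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing one submodule no longer drags in its sibling.
import lean_pkg.submodule
sibling_loaded_early = "lean_pkg.sibling" in sys.modules

# Users who want the flattened namespace ask for it explicitly.
from lean_pkg.all import SomeClass, SiblingClass
sibling_loaded_late = "lean_pkg.sibling" in sys.modules

print(sibling_loaded_early, sibling_loaded_late)
```

The trade-off is that the flattened names are no longer available from the top-level package itself, so callers have to know about the extra module.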
0

This is not a complete solution, but "standalone py.test (ignore __init__.py files)" proposes setting a global flag to detect when the code is running under test. This corrects the problem for tests at least, provided the concerned modules don't call each other.
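A rough sketch of the flag approach, assuming the flag is stored as an attribute on sys. The hook names pytest_configure and pytest_unconfigure are pytest's own; the attribute name _called_from_test and the helper should_flatten are arbitrary choices for this illustration. The hooks are called by hand here so the snippet runs without pytest.

```python
import sys

# These two functions would live in conftest.py; pytest calls them
# at the start and end of a test session.
def pytest_configure(config):
    sys._called_from_test = True

def pytest_unconfigure(config):
    del sys._called_from_test

# module/__init__.py could then guard its eager imports with a check like this
# (should_flatten is a hypothetical helper, not part of any library).
def should_flatten():
    return not hasattr(sys, "_called_from_test")

# Simulate a test session by invoking the hooks manually.
before = should_flatten()
pytest_configure(None)
during = should_flatten()
pytest_unconfigure(None)
after = should_flatten()

print(before, during, after)
```

In a real project the guard would wrap the from .submodule import * lines in module/__init__.py, and it only helps if the flag is set before the package is first imported.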

yawn