Sorry for the confusing title. I tried at least! Here is my directory structure:

root\
    mypackage\
        __init__.py
        mymodule.py
    main.py

mymodule.py

print('inside mymodule.py')

__init__.py

print('inside __init__')
from . import mymodule as m

main.py

import mypackage
print(mypackage.m)
print(mypackage.mymodule)  # <--- Why does it work?

and the output:

inside __init__
inside mymodule.py
<module 'mypackage.mymodule' from 'C:\\Users...\\mypackage\\mymodule.py'>
<module 'mypackage.mymodule' from 'C:\\Users...\\mypackage\\mymodule.py'>

In main.py, when I import mypackage, that name refers to the module object created from __init__.py, so I can access every object/name defined in that module. It makes sense that mypackage.m works, because m is a name in __init__.py's global namespace.

But there should be no mymodule key/name in __init__.py's namespace, because as m binds the module to the name m instead of mymodule.

Question: why does print(mypackage.mymodule) work without raising an exception?

Additional information: if I have another module inside the package, say temp.py, then print(mypackage.temp) does not work, because again mypackage refers to __init__.py and nothing there loads temp.
It is also interesting that if I put print(mymodule) inside __init__.py (after the import line) and run main.py, it runs correctly.

S.B

1 Answer

Short answer

After two days of experimenting with import statements and searching the documentation, I found the answer. First I'll state why this happens, then explain it in more detail. (Also see the Update section at the bottom.)

Look at this passage from the documentation, and check the package structure tree in the example it refers to:

If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py.

How we import mypackage in main.py does not matter: import mypackage and from mypackage import * behave the same way here.

Either way, Python runs __init__.py, so the package gets every name defined there (such as the m we saw above) and also every submodule that __init__.py explicitly loads (here, mymodule). Strictly speaking, loading the submodule adds a 'mymodule' key to the package's namespace, which is the global namespace of __init__.py.

Let's look at this in more detail:

I'm going to change __init__.py slightly so that it can be run directly as the main module (the relative import prevents that, so I replace it with an absolute one), and then print what is in its global namespace. (Don't forget to add the directory that contains mypackage to PYTHONPATH, otherwise the absolute import cannot find the package.)
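To check this claim without touching a real project, here is a minimal sketch that rebuilds the package from the question in a temporary directory and lists the non-dunder names in its namespace. The package name demo_pkg and the VALUE constant are hypothetical, chosen only for the illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway copy of the package from the question.
# "demo_pkg" is a hypothetical name, used to avoid clashing with real packages.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "mymodule.py").write_text("VALUE = 1\n")
(pkg_dir / "__init__.py").write_text("from . import mymodule as m\n")

# Make the temporary directory importable, then import the package.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_pkg = importlib.import_module("demo_pkg")

# Names in the package namespace that are not dunder attributes.
public_names = sorted(k for k in vars(demo_pkg) if not k.startswith("__"))
print(public_names)                     # ['m', 'mymodule']
print(demo_pkg.m is demo_pkg.mymodule)  # True
```

Both names point at the same module object: m comes from the as clause, while mymodule is added by the import system itself.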

# __init__.py
print('inside __init__')
from mypackage import mymodule as m
print("--------------------------------")
for k, v in globals().copy().items():
    if not k.startswith('__'):
        print(k)

Output:

inside __init__
inside __init__
inside mymodule.py
--------------------------------
mymodule
m
--------------------------------
m

"inside __init__" is printed twice because the file is executed twice: once as the __main__ script, and once more when the line from mypackage import mymodule as m imports the package.

The two listings under the dashed lines differ. The first, printed while mypackage is being imported, contains both mymodule and m; the second, printed by the __main__ run, contains only m.

When we run __init__.py directly, an entry named '__main__' is added to sys.modules. When we import mypackage, another entry named 'mypackage' is added. The interesting part is that both point to the same file in the same location, BUT the module objects created from that file are not the same object.

To demonstrate this, I'm going to add a few lines of code to mymodule.py. They help us see the file and the module objects involved:
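The same effect can be reproduced in isolation. The following is a small sketch, not what Python does for __main__ internally: it uses importlib.util to execute one source file under two different module names. The file name shared.py and the names first_name and second_name are made up for the example.

```python
import importlib.util
import tempfile
from pathlib import Path

# A throwaway source file; its name and the module names below are hypothetical.
source = Path(tempfile.mkdtemp()) / "shared.py"
source.write_text("counter = 0\n")


def load_as(name):
    """Execute the file at `source` as a fresh module object called `name`."""
    spec = importlib.util.spec_from_file_location(name, str(source))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


first = load_as("first_name")
second = load_as("second_name")

first.counter = 99  # changes only the first module's namespace

print(first.__file__ == second.__file__)  # True: same file on disk
print(first is second)                    # False: two distinct module objects
print(second.counter)                     # 0: the namespaces are independent
```

Each execution of the file gets its own namespace, which is why a name bound in one module object does not show up in the other.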

# mymodule.py
print('inside mymodule.py')

import sys
v1 = sys.modules['mypackage']
v2 = sys.modules['__main__']
print(v1)
print(v2)
print(f'v1 is v2: {v1 is v2}')
print(f'v1 == v2: {v1 == v2}')
print(f'v1.__dict__ == v2.__dict__: {v1.__dict__ == v2.__dict__}')
print('mypackage', list(v1.__dict__))
print('__main__', list(v2.__dict__))

Now let's run __init__.py module directly again!

inside __init__
inside __init__
inside mymodule.py
<module 'mypackage' from 'C:\\Users...\\mypackage\\__init__.py'> # these are the same
<module '__main__' from 'C:\\Users...\\mypackage\\__init__.py'> # these are the same 
v1 is v2: False
v1 == v2: False
v1.__dict__ == v2.__dict__: False
mypackage ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__path__', '__file__', '__cached__', '__builtins__']
__main__ ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__annotations__', '__builtins__', '__file__', '__cached__']
--------------------------------
mymodule
m
--------------------------------
m

As the output shows, they are not the same at all. v1 is a package, because it has a __path__ attribute; v2 is a plain module.

Python adds the 'mymodule' key only to the namespace of the module object created by import mypackage, as a side effect of loading the submodule that __init__.py explicitly imports. The '__main__' module object created from the same file does not get that key.
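The __path__ check can be tried on any installed package. As a quick sketch, using the standard library's json package purely as a convenient example (the helper is_package is a name introduced here):

```python
import json
import json.decoder


def is_package(module):
    """Return True if the module object has a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(json))          # True: json is a package (a directory with __init__.py)
print(is_package(json.decoder))  # False: json.decoder is a plain module
```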


Update:

I've found a section of the documentation that addresses exactly this:
https://docs.python.org/3/reference/import.html#submodules

When a submodule is loaded using any mechanism, a binding is placed in the parent module’s namespace to the submodule object.
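This also explains the temp.py case from the question. Here is a minimal sketch, assuming a hypothetical package demo_parent with an empty __init__.py and an empty temp.py: the attribute is missing until the submodule is loaded, and it appears afterwards even though the calling code never binds a name for it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway package; "demo_parent" and its contents are hypothetical.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_parent"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")  # loads no submodules
(pkg_dir / "temp.py").write_text("")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_parent = importlib.import_module("demo_parent")

# Nothing has loaded the submodule yet, so the attribute does not exist.
before = hasattr(demo_parent, "temp")

# Load the submodule and deliberately discard the return value.
importlib.import_module("demo_parent.temp")

# The import system has bound the submodule in the parent's namespace.
after = hasattr(demo_parent, "temp")
same_object = demo_parent.temp is sys.modules["demo_parent.temp"]

print(before, after, same_object)  # False True True
```

The binding is made by the import machinery on the parent module object, so it does not depend on which names the importing code chooses with as.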

S.B