I read the docs and quite a few Stack Overflow posts, but did not find an explicit answer to my questions.

I think I understand what namespace packages are for.

I am only interested in Python>=3.3 and the implicit namespace packages - folders without the __init__.py.

Questions

  1. Are namespace packages supposed to contain only other packages, or are modules (i.e. .py files) also "allowed"?

  2. Are namespace packages supposed to be used only as "container" packages, or can they also be contained in regular packages?

  3. If namespace packages only make sense as containers, I guess I could state that whenever I have a real package folder, all its subfolders containing Python modules should also have an __init__.py?

# this is fine
ns_package/
+-- real_package/
   +-- __init__.py

# how about this?
real_package/
+-- __init__.py  # I have it for docs AND want to force the dir to be a real package
+-- ns_package/  # I would just like to avoid an empty __init__.py
   +-- amodule.py

I suspect that namespace packages only make sense as containers, because in the other case I would not be able to extend the namespace with other things in a different path, since the parent is a real package that must be defined in a single point of the file system. And therefore I wouldn't get the primary advantage of namespace packages.

Context

I am asking because the case with an implicit namespace package inside a regular package works perfectly fine when running and importing modules (from the root of the project). However, it requires some tweaking of the setup script for installation, and I wonder whether I am doing something flawed in the first place.

Note: I am trying to use implicit namespace packages not primarily because I want to exploit their features, but because I hate empty __init__.py files. I initially thought that Python 3.3 had finally got rid of them and that packages no longer needed an __init__.py, but it seems it is not that simple...

L. Bruce

1 Answer

To begin with: Your motivation for using namespace packages is flawed. There's nothing wrong with empty __init__.py files; they might be empty now but later can be filled with content. Even if they stay empty that doesn't cause any trouble.

Having said that, technically there's nothing wrong with putting a namespace package inside a regular package. When you perform an import of the form import a.b.c then each component is resolved separately and b can be a namespace package that lives inside a regular package a. Consider the following directory layout:

.
└── a
    ├── b
    │   └── c.py
    └── __init__.py

Then you can import the module c:

>>> import a.b.c
>>> a
<module 'a' from '/tmp/a/__init__.py'>
>>> a.b
<module 'a.b' (namespace)>
>>> a.b.c
<module 'a.b.c' from '/tmp/a/b/c.py'>

As you can see, each component is imported as a module object of its own. The namespace package a.b is not backed by any file, so its __file__ attribute is None instead of a path.
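
To reproduce this outside the interactive session, here is a minimal sketch that builds the same layout in a temporary directory and inspects the three modules. The temporary directory and the VALUE constant inside c.py are assumptions made for this illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the layout from above in a throwaway directory:
#   a/__init__.py   (regular package)
#   a/b/c.py        (b has no __init__.py, so it is a namespace package)
root = Path(tempfile.mkdtemp())
(root / "a" / "b").mkdir(parents=True)
(root / "a" / "__init__.py").write_text("")
(root / "a" / "b" / "c.py").write_text("VALUE = 42\n")  # VALUE is illustrative

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import a.b.c

# The regular package is backed by its __init__.py.
print(a.__file__)
# The namespace package has no file; it only has a list of search locations.
print(getattr(a.b, "__file__", None))
print(list(a.b.__path__))
# The module inside the namespace package is an ordinary module.
print(a.b.c.VALUE)
```

The getattr call with a default is deliberate: it tolerates interpreters that leave __file__ unset on namespace packages instead of setting it to None.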

However, this setup defeats the main purpose of namespace packages, namely that they can be split over multiple directories. Even though b is a namespace package, it lives inside the regular package a. Once a has been imported it is cached in sys.modules, and its submodules are looked up only in a.__path__, which for a regular package is the single directory containing its __init__.py; the rest of the import path is never searched. As an example, consider the following directory layout:

.
├── dir1
│   └── parent
│       ├── child
│       │   └── one.py
│       └── __init__.py
├── dir2
│   └── parent
│       ├── child
│       │   └── two.py
│       └── __init__.py
└── main.py

There are two namespace packages, dir1/parent/child and dir2/parent/child. However, you can only use one of them, since the regular package dir1/parent prevents access to the other. Let's try the following content for main.py:

import sys

sys.path.extend(('dir1', 'dir2'))

import parent.child.one  # this works

print(sys.modules['parent'])
print(sys.modules['parent.child'])
print(sys.modules['parent.child.one'])

import parent.child.two  # this fails

print(sys.modules['parent.child.two'])

and we'll get the following output:

<module 'parent' from 'dir1/parent/__init__.py'>
<module 'parent.child' (namespace)>
<module 'parent.child.one' from 'dir1/parent/child/one.py'>
Traceback (most recent call last):
  File "main.py", line 11, in <module>
    import parent.child.two
ModuleNotFoundError: No module named 'parent.child.two'

This is because sys.modules['parent'] is a regular package, so in import parent.child.two the parent component resolves to that very package. It does have an attribute child, but this namespace only covers dir1/parent/child and therefore doesn't contain two. Finding that module would require searching further along the import path, which does not happen.
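
The restriction shows up directly in the __path__ attributes. The following is a minimal sketch that rebuilds the layout in a temporary directory and prints them. The temporary directory, and putting dir1 and dir2 at the front of sys.path rather than extending it, are assumptions of this sketch and not part of the original main.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate dir1/ and dir2/, each with a regular package "parent"
# that contains a namespace package "child".
root = Path(tempfile.mkdtemp())
for portion, module in (("dir1", "one"), ("dir2", "two")):
    child = root / portion / "parent" / "child"
    child.mkdir(parents=True)
    (child / f"{module}.py").write_text("")
    (root / portion / "parent" / "__init__.py").write_text("")

# Put both directories first on sys.path, dir1 before dir2.
sys.path[:0] = [str(root / "dir1"), str(root / "dir2")]
importlib.invalidate_caches()

import parent.child.one

# A regular package has exactly one directory in __path__:
# the one holding the first __init__.py found on sys.path.
print(list(parent.__path__))
# The nested namespace package is only searched for inside parent.__path__,
# so it ends up with a single location as well.
print(list(parent.child.__path__))

error = None
try:
    import parent.child.two
except ModuleNotFoundError as exc:
    error = exc
    print("import failed:", exc)
```

Both lists contain a single entry under dir1, which is why dir2/parent/child/two.py is never found.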

Removing the two __init__.py files from the above folder structure turns the two regular packages into namespace packages and the above script will work (i.e. import parent.child.two succeeds):

<module 'parent' (namespace)>
<module 'parent.child' (namespace)>
<module 'parent.child.one' from 'dir1/parent/child/one.py'>
<module 'parent.child.two' from 'dir2/parent/child/two.py'>
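
For comparison, here is the same kind of sketch without the __init__.py files, again in a temporary directory. The directory and the NAME constants written into the modules are assumptions for illustration only.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate dir1/ and dir2/ without any __init__.py, so that both
# "parent" and "parent.child" are namespace packages.
root = Path(tempfile.mkdtemp())
for portion, module in (("dir1", "one"), ("dir2", "two")):
    child = root / portion / "parent" / "child"
    child.mkdir(parents=True)
    (child / f"{module}.py").write_text(f"NAME = {module!r}\n")  # illustrative

# Put both directories first on sys.path, dir1 before dir2.
sys.path[:0] = [str(root / "dir1"), str(root / "dir2")]
importlib.invalidate_caches()

import parent.child.one
import parent.child.two

# A namespace package collects one __path__ entry per matching directory.
print(list(parent.__path__))
print(list(parent.child.__path__))
print(parent.child.one.NAME, parent.child.two.NAME)
```

Here each __path__ lists one directory per portion, so modules from both dir1 and dir2 are importable under the same package name.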

To answer your questions specifically:

1) You can have .py files at any level of a namespace package hierarchy. As long as a directory doesn't contain an __init__.py file, it's considered a namespace package and its contents are resolved accordingly. Consider the following directory layout:

.
└── a
    ├── b
    │   ├── c
    │   │   └── three.py
    │   └── two.py
    └── one.py

You can import any of the modules inside any of the namespace packages:

>>> import a.one
>>> import a.b.two
>>> import a.b.c.three
>>> a.b.c
<module 'a.b.c' (namespace)>
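
If you want to check programmatically what each name in such a tree resolves to, one possible sketch uses importlib.util.find_spec. The kind() helper is hypothetical and written only for this illustration, and the tree is created in a temporary directory.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Build a hierarchy without any __init__.py, with a module at every level.
root = Path(tempfile.mkdtemp())
(root / "a" / "b" / "c").mkdir(parents=True)
(root / "a" / "one.py").write_text("")
(root / "a" / "b" / "two.py").write_text("")
(root / "a" / "b" / "c" / "three.py").write_text("")

sys.path.insert(0, str(root))
importlib.invalidate_caches()


def kind(name):
    """Classify an importable dotted name (hypothetical helper)."""
    spec = importlib.util.find_spec(name)
    if spec.submodule_search_locations is None:
        return "module"
    # Every package has search locations; only a regular package
    # is also backed by a file (its __init__.py).
    return "regular package" if spec.has_location else "namespace package"


for name in ("a", "a.one", "a.b", "a.b.two", "a.b.c", "a.b.c.three"):
    print(f"{name}: {kind(name)}")
```

Note that find_spec imports the parent packages of a dotted name as a side effect, which is harmless here.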

2) As detailed above, you can place namespace packages inside regular packages but it doesn't make much sense, since it prevents their intended usage.

3) This depends very much on what you mean by "should". Technically the __init__.py is not required, but it definitely makes a lot of sense.

As noted at the beginning, __init__.py files have a purpose beyond marking regular Python packages, and they often get filled with content too. If they stay empty, that is nothing to worry about.

a_guest
  • Thanks for your answer. I know empty `__init__.py` are totally harmless, it just disturbs me to see them in the filesystem, but this is not a technical argument :). 1) Good, that is what I was already doing myself. I just did not see examples of modules directly inside namespace packages, in the docs, so I was wondering. 2) I understand, that is exactly what I suspected (see my original question); I was looking for a confirmation :). 3) Because of 2), I would then always put the `__init__` in subfolders of real packages, to avoid confusion. – L. Bruce Jul 20 '20 at 10:04
  • The tuple argument to `sys.path.extend()` duplicates the string argument; I assume that `(dir1, dir2)` was intended. And sure, the tree in the example is a valid package structure but it doesn't really show what namespace packages could be used for: consolidating separate "portions" installed into separate directories. – Mike C Mar 22 '22 at 18:49
  • @MikeC Thanks for the comment. I fixed the `extend(...)` typo and added the output for the case when `__init__.py` files are removed (i.e. when `dir{1,2}/parent` are namespace packages). – a_guest Mar 22 '22 at 20:05