42

I'm learning Python, and I can't figure out how imports in __init__.py work.

I understand from the Python tutorial that the __init__.py file initializes a package, and that I can import subpackages here.

I'm doing something wrong, though. Could you explain for me (and for future Python-learners) what I'm doing wrong?

Here's a simplified example of what I'm trying to do.

This is my file structure:

package
    __init__.py
    test.py
    subpackage
        __init__.py
        hello_world.py

The contents of hello_world.py:

def do_something():
    print("Hello, world!")

subpackage/__init__.py is empty.

package/__init__.py contains:

import test.submodule.do_something

And finally, test.py contains:

do_something()

This is how I attempt to run test.py using the OS X terminal and Python 3:

python test.py

Python then throws the following error:

NameError: name 'do_something' is not defined
daouzli
Benjamin
  • I presume the `test.py` you are running is `package/test.py`? If so there is no need that I can see for it to be in a package, and so `package/__init__.py` would seem to be completely irrelevant. – holdenweb Jun 19 '14 at 09:08

3 Answers

34

You probably already understand that when you import a module, the interpreter creates a new namespace and executes the code of that module with the new namespace as both the local and global namespace. The module object is recorded against its __name__ in sys.modules, and when the code completes execution the module name (or the name given in any as clause) is bound to that module object within the importing namespace.
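As a quick check of the paragraph above, here is a minimal sketch using the standard-library json module (chosen only for illustration). It shows that the cache key in sys.modules is the module's own __name__, while the name bound in the importing namespace is whatever the as clause says.

```python
import sys
import json as j  # binds the module object to the local name "j"

# The cache is keyed by the module's real name, not by the alias.
cached = sys.modules["json"]
same_object = cached is j
real_name = j.__name__

# A second import finds the cached module instead of executing it again.
import json as k
reused = k is j

print(same_object, real_name, reused)
```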

When a qualified name such as package.subpackage.module is imported, the first name (package) is imported into the local namespace, then subpackage is imported into package's namespace, and finally module is imported into package.subpackage's namespace. Imports using from ... import ... as ... perform the same sequence of operations, but the imported objects are bound directly to names in the importing module's namespace. The fact that the package name isn't bound in your local namespace does not mean it hasn't been imported (as inspection of sys.modules will show).
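The difference between the two forms can be seen with a small sketch. The dictionaries below stand in for an importing module's namespace (an assumption made purely so the result can be inspected), and os.path and xml.etree are just convenient standard-library examples.

```python
import sys

def public_names(namespace):
    """Return the names bound in a namespace, ignoring dunder entries."""
    return sorted(name for name in namespace if not name.startswith("__"))

# Plain qualified import: only the first name is bound locally.
plain_ns = {}
exec("import os.path", plain_ns)
bound_plain = public_names(plain_ns)

# from-import: only the imported object is bound locally...
from_ns = {}
exec("from xml.etree import ElementTree", from_ns)
bound_from = public_names(from_ns)

# ...but the containing packages were still imported and cached.
parents_cached = "xml" in sys.modules and "xml.etree" in sys.modules

print(bound_plain, bound_from, parents_cached)
```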

The __init__.py in a package serves much the same function as a module's .py file. A package, having structure, is written as a directory which can also contain modules (regular .py files) and subdirectories (each also containing an __init__.py file) for any sub-packages. When the package is imported, a new namespace is created and the package's __init__.py is executed with that namespace as the local and global namespaces. So to answer your problem we can strip your file layout down by omitting the top-level package, which will never be considered by the interpreter when test.py is run as a program. It would then look like this:

test.py
subpackage/
    __init__.py
    hello_world.py
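To see that a package's __init__.py really is executed on import, and that an imported submodule becomes an attribute of the package, here is a minimal sketch. It builds a throwaway package in a temporary directory; the names demo_pkg_init, mod, loaded_by_init and value are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are made up).
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_init"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("loaded_by_init = True\n")
(pkg_dir / "mod.py").write_text("value = 1\n")

# Make the temporary directory importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg_init.mod

# __init__.py ran in the package's namespace...
init_ran = demo_pkg_init.loaded_by_init
# ...and the submodule was bound as an attribute of the package.
submodule_value = demo_pkg_init.mod.value

print(init_ran, submodule_value)
```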

Now, subpackage is no longer a sub-package, as we have removed the containing package as irrelevant. Focusing on why the do_something name is undefined might help. test.py does not contain any import, and so it's unclear how you are expecting do_something to acquire meaning. You could make it work by using an empty subpackage/__init__.py and then you would write test.py as

from subpackage.hello_world import do_something
do_something()

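Alternatively you could use a subpackage/__init__.py that reads as follows (the leading dot makes the import relative to the package, which Python 3 requires here):

from .hello_world import do_something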

which establishes the do_something function inside the subpackage namespace when the package is imported. Then use a test.py that imports the function from the package, like this:

from subpackage import do_something
do_something()

A final alternative with the same __init__.py is to use a test.py that simply imports the (sub)package and then uses a qualified name to access the required function:

import subpackage
subpackage.do_something()

Here only the name subpackage is bound in test.py's namespace, and the function is reached as an attribute of the package.
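The re-exporting __init__.py variants can be tried end to end with the sketch below. It recreates the stripped-down layout in a temporary directory. Two things are assumptions of this sketch rather than part of the original code: do_something returns its greeting instead of printing it (so the result is easy to check), and the __init__.py uses the explicit relative form with a leading dot, which is what Python 3 expects.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate subpackage/ with a re-exporting __init__.py.
root = Path(tempfile.mkdtemp())
sub_dir = root / "subpackage"
sub_dir.mkdir()
(sub_dir / "hello_world.py").write_text(
    'def do_something():\n    return "Hello, world!"\n'
)
(sub_dir / "__init__.py").write_text("from .hello_world import do_something\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Variant 1: import the package, then use a qualified name.
import subpackage
via_package = subpackage.do_something()

# Variant 2: pull the re-exported name straight from the package.
from subpackage import do_something
via_name = do_something()

print(via_package, via_name)
```

Both variants reach the same function object, because the package namespace simply holds a reference to the function defined in hello_world.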

With the empty __init__.py this could also be achieved with a test.py reading

import subpackage.hello_world
subpackage.hello_world.do_something()

or even

from subpackage.hello_world import do_something
do_something()

An empty __init__.py will mean that the top-level package namespace will contain only the names of any subpackages the program imports, which allows you to import only the subpackages you require. This gives you control over the namespace of the top-level package.

While it's perfectly possible to define classes and functions in the __init__.py, a more normal approach is to import things into that namespace from submodules so that importers can just import the top-level package to gain access to its contents with a single-level attribute reference, or even use from to import only the names you specifically want.

Ultimately the best tool to keep you straight is a clear understanding of how import works and what effect its various forms have on the importing namespace.

holdenweb
  • 2
    Ah, I think I understand now. My mistake was in thinking that "import" worked similarly to "include" in PHP or C, and that __init__.py was code that would always run before a package was used. Thanks for the great answer. – Benjamin Jun 19 '14 at 13:53
  • 2
    A pleasure, and I am glad it helped. I teach this stuff for a living, but when I'm not too busy it's great to rejoin the community and help people out – holdenweb Jun 21 '14 at 17:34
  • 1
    I suspect that importing a module as a sub-package or from a sub-package may result in importing the sub-package twice: import subpackage.hello_world will re-import subpackage even when another import statement imported subpackage. I guess the Google's Python coding guideline mentions this briefly, https://google-styleguide.googlecode.com/svn/trunk/pyguide.html?showone=Packages#Packages – eel ghEEz May 28 '15 at 20:05
  • 2
    This note points out the double-importing danger, https://utcc.utoronto.ca/~cks/space/blog/python/RelativeImportProblem – eel ghEEz May 28 '15 at 20:24
  • TL;DR it's not a bad approach to use other packages in the package you are creating, but be careful not to externalize them. – Sandburg Jan 24 '19 at 16:08
  • What would be the best way to add imports that are used across multiple files in a module if you can't put it in __init__.py? I really would rather not add it to each file individually. – Dustin K Jun 04 '20 at 12:39
  • What gave you the impression you can't put it in `__init__.py`? Since you asked, I've edited the answer to explicitly address this issue - hope it helps. – holdenweb Jun 05 '20 at 12:27
4

First, you have to understand how import alone works:

import test.submodule.do_something

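This would try to load do_something as a module from submodule, itself loaded from test. None of those names matches your layout, and do_something is a function rather than a module, so it cannot be the last component of a plain import statement.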

You want to load something from subpackage, so start with that:

import subpackage

Fine, subpackage/__init__.py is loaded.

Now, you want the do_something() function which is in the file (a "module") hello_world.py. Easy:

from subpackage.hello_world import do_something

And you are done! Just read this line out loud, it does exactly what it says: import do_something from the module hello_world which is in the subpackage package.

Try that in test.py

from subpackage.hello_world import do_something

do_something()

It should work just fine.

Now, the second issue:

The __init__.py in package/ won't be run, since you don't use package/ as a package. It will only be run if you import package or anything in it, e.g.:

from package import test

Otherwise, it won't be loaded at all.
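This difference between running a file as a script and importing it through its package can be observed directly. The sketch below is an illustration under assumptions: it writes a tiny package/ directory to a temporary location, with print calls added to both files purely as markers, and runs each case in a child interpreter.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A throwaway copy of the layout; the print calls are only markers.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "package"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text('print("package/__init__.py ran")\n')
(pkg_dir / "test.py").write_text('print("test.py ran")\n')

def run(*args):
    """Run a child Python interpreter in root and return its output lines."""
    result = subprocess.run(
        [sys.executable, *args],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()

# Case 1: test.py run as a program; the package is never imported.
as_script = run(str(pkg_dir / "test.py"))

# Case 2: test imported through the package; __init__.py runs first.
as_import = run("-c", "from package import test")

print(as_script)
print(as_import)
```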

However, if you want do_something() to be loaded when subpackage is imported, put from .hello_world import do_something in subpackage/__init__.py, and then, in your test.py, do an import subpackage and call subpackage.do_something().

evuez
  • I don't believe you can import an individual function, which is what `import test.submodule.do_something` attempts to do. You would instead be forced to use `import test.submodule` if you wanted the reference `test.submodule.do_something` to be valid. If you `import os.path`, you will find that the `os` module is present in the importing namespace - where else could `os.path` live? – holdenweb Jan 24 '19 at 16:27
2

It's an absolute hard-and-fast rule in Python that a name must always be defined or imported within the module where you're using it. Here you never import anything inside test.py - so as the error says, do_something is not defined.

Even if your package/__init__.py file was executed (which, as others have pointed out, it isn't), your code still wouldn't work as it is, because the import of do_something has to be done inside test.py if you want to reference it in that file.
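The rule can be demonstrated with a small sketch. The two dictionaries stand in for the global namespaces of two separate modules (an assumption made so the example fits in one file), and math.sqrt is used only as a convenient name to import.

```python
# Each dict plays the role of one module's global namespace.
module_a = {}
module_b = {}

# "Module A" imports sqrt, so the name exists there.
exec("from math import sqrt", module_a)
a_has_sqrt = "sqrt" in module_a

# "Module B" never imported it, so using the name fails,
# no matter what other modules have already imported.
try:
    exec("sqrt(4)", module_b)
    outcome = "worked"
except NameError as exc:
    outcome = str(exc)

print(a_has_sqrt, outcome)
```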

Daniel Roseman