7

Consider the following hierarchy of three regular packages and their contents:

quick
├── brown
│   ├── fox.py
│   └── __init__.py
├── lazy
│   ├── dog.py
│   └── __init__.py
└── __init__.py

Now suppose there is a function jump in module dog and it is needed in module fox. How should I proceed?

Having recently seen Raymond Hettinger's talk at Pycon 2015 I would the like the function to be directly importable from the root of package lazy, like this:

from lazy import jump 

Also, it seems to me that writing relative imports is more concise and makes the intra-package connections easily visible. Hence, I'd write this into lazy/__init__.py:

from .dog import jump

And this into fox.py:

from ..lazy import jump

But I wonder, is this the right way?

First, importing the name jump in lazy/__init__.py does nothing to prevent it from being imported directly from dog. Can it cause problems if a function is potentially imported from many places? For instance, in unit testing, can we possibly monkey patch the name from a wrong location?

Moreover, IDEs with their auto-import routines seem to prefer importing from the module where the function is defined. I could perhaps override this by putting character _ in front of all module names, but this seems a bit impractical.

Is it otherwise dangerous to bring all names that are needed outside a package to __init__.py? Probably this at least increases the possibility of circular imports. But I guess that if a circular import is encountered there is something fundamentally wrong with the package structure anyway.

What about the relative imports? PEP 8 says that absolute imports are recommended: what does it mean when it says that absolute imports behave better than the relative ones? Can you give me an example?

jasaarim
  • 1,806
  • 15
  • 19
  • Are you sure `jump` belongs in `dog.py`, and not higher up in the hierarchy? – chepner May 14 '15 at 14:15
  • Yes, I'm thinking `lazy` as some sort of a coherent collection of components, some of which are needed also elsewhere in `quick`. – jasaarim May 14 '15 at 14:21
  • 1
    I would recommend be explicit, keep it as simple as possible. Intra-package imports can get a bit crazy over time. So forget all the `__init__.py` magic. If you really want to, you could do that for external user visible names only in the top level `__init__.py`. – suvayu May 14 '15 at 14:29

1 Answers1

4

Explicit interface declaration: If you want to expose the jump function as belonging to the lazy package, then it makes sense to include it lazy.__init__ , as you suggest. That way you're making it clear that it is part of lazy's "public interface". You're also suggesting the other modules are not part of the public interface.

On preventing people/tools from importing it directly from dog: In Python, privacy is up to the consent of the user, you can't forcibly hide anything, but there are conventions.

Using underscores and defining dog._jump() makes it clear that dog doesn't want to expose _jump. And we could assume that any IDE tools should respect this type of convention. At any rate, if dog defines _jump, and lazy exposes jump, then you won't have the problem of not knowing which is imported, because the names are different, so this is explicit, which is considered good in Python.

Here's a nice pointer on this topic: Defining private module functions in python

On relative imports:: those are discouraged by PEP 8, however they were implemented for a reason, and they replaced implicit relative imports. The reason in PEP 8: especially when dealing with complex package layouts where using absolute imports would be unnecessarily verbose.

Final thoughts: in short, if you consider the lazy package to be a library and don't want to expose the internal modules, then I think it makes sense to expose the objects in lazy.__init__. If on the contrary you want people to know there is a dog module, then by all means, let other modules do:

from lazy.dog import jump

or

from ..lazy import jump

If brown and brown.fox will always be packaged and are tightly integrated with lazy then I don't see the difference between absolute and relative, but I would slightly prefer relative, to explicitly indicate you're referring to an internal module.

But if you think they could be split in the future, then relative import doesn't make sense and you'd rather do, depending on the above points:

from lazy.dog import jump
or:
from lazy import jump

Community
  • 1
  • 1
GCord
  • 146
  • 6
  • This is a nicely organized answer. It clarifies well some of my concerns, but I can't really see an obvious way to do this (as [The Zen of Python](https://www.python.org/dev/peps/pep-0020/) suggests). – jasaarim May 16 '15 at 13:27
  • Also, I think that defining functions like `jump` as internal like `_jump` is a bit clumsy because then I would have to define almost everything as internal. But there is a point as you say, because then the names would be different. – jasaarim May 16 '15 at 13:27
  • I guess there's no "obvious" way in the sense of The Zen of Python, because Python supports both ways. Here they kind of muddied that principle. I suggest you look at several standard packages to get an idea of the best practices. For instance, the xml package makes use of relative imports (limited to one dot such as `from .xmlreader import InputSource`), and also includes some classes into the package's `__init__.py` file, and defines variables and classes with an `_` to mark them as private. Most other standard packages don't do that and have an empty `__init__.py`. – GCord May 18 '15 at 21:01