26

I just stumbled across this unexpected behavior in Python (both 2.7 and 3.x):

>>> import re as regexp
>>> regexp
<module 're' from '.../re.py'>
>>> from regexp import search
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'regexp'

Of course `from re import search` succeeds, just as it would have before I created the alias. But why can't I use the alias `regexp`, which is now a known module, as a source for importing names?

This sets you up for a nasty surprise whenever there are multiple variants of a module. Say I am still using Python 2 and I want to use the C version of `pickle`, `cPickle`. If I then try to import a name from `pickle`, it will be fetched from the plain `pickle` module (and I won't notice, since it doesn't throw an error!):

>>> import cPickle as pickle
>>> from pickle import dump
>>> import inspect
>>> inspect.getsourcefile(dump)
'.../python2.7/pickle.py'    # Expected cPickle.dump 

Oops!

Poking around, I see that `sys.modules` includes the real module name (`re` or `cPickle`), but not the alias `regexp` or `pickle`. That explains how the second import fails, but not why Python's module name resolution works this way, i.e. what the rules and rationale are.
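For reference, a minimal sketch that reproduces this observation on Python 3. It assumes that no module named `regexp` is installed anywhere on the search path; the variable names are only illustrative.

```python
import sys
import re as regexp

# The module is registered under its real name, not under the alias.
real_name_registered = "re" in sys.modules
alias_registered = "regexp" in sys.modules

# The module object itself still reports its real name.
module_name = regexp.__name__

print(real_name_registered, alias_registered, module_name)
```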

Note: This was marked as a duplicate of a question that has nothing to do with module aliasing: aliasing is not even mentioned in the question (which is about importing submodules from a package) or the top answers. While the answers to that question provide information relevant to this question, the questions themselves are not even similar IMHO.

alexis
  • Because the import mechanism (`import X` or `from X import Y`) doesn't check the value of the variable `X`; it searches for a file/folder called 'X'. – Copperfield Nov 26 '16 at 21:56
  • Instead of `from regexp import search` and `from pickle import dump`, why not just `search = regexp.search` and `dump = pickle.dump`? –  Nov 26 '16 at 21:57
  • @suspiciousdog, why _wouldn't_ I use the `from X import Y` form? It's standard syntax and very, very pythonic (just count the mentions in PEP 8). – alexis Nov 26 '16 at 22:00
  • Because in your examples, it doesn't work as intended? Your "nasty surprise" scenario simply isn't compelling because there's a clear and simple work-around. And as @jmd_dk [pointed out](http://stackoverflow.com/a/40823500/6732794), you can't use the same syntax for importing from a static module name and from a variable module name because it's ambiguous: when should Python interpret the module name literally vs. dynamically? The zen of Python says: _In the face of ambiguity, refuse the temptation to guess_ –  Nov 26 '16 at 22:17
  • The syntax is `import MODULE as NAME`. The answer is that a module and a name are not interchangeable. A module is either a file, `XXX.py`, or a directory in the search path that has the appropriate `__init__.py` in it. – dawg Nov 26 '16 at 22:29
  • @BrenBarn, how is this a duplicate to the [question](http://stackoverflow.com/questions/12229580/python-importing-a-sub-package-or-sub-module) you selected? I understand that the answer states "When you use an import statement it always searches the actual module path", which is relevant here; but neither that question nor the accepted answer mention `import .. as` aliasing at all. – alexis Nov 26 '16 at 22:32
  • It doesn't mention it explicitly, but the answer is what you just quoted. "It always searches the module path." It doesn't search anything else. The answer there mentions that it doesn't search "module objects". When you do `import x as y`, you're just getting a module object called `y`, which is not searched, as all other module objects are not searched. What is searched is always the module path, and not any objects in any namespace. – BrenBarn Nov 26 '16 at 22:35
  • I understand the relevance of that answer, but shouldn't _the questions_ be related to close as a duplicate? – alexis Nov 26 '16 at 22:37
  • @suspicousdog [sic], I don't think you're getting the concept of "surprise": work-arounds are something you do after you have already been surprised. If it does something you don't intend, and it's not immediately obvious, it's a problem. – alexis Nov 28 '16 at 11:28

4 Answers

24

In short:

You can think of the loading process this way:

You can load a module into your program in the form of a variable, and you can name that variable whatever you want. The loading process itself, however, is based on the name of the module's file, not on "module variables".


Long version:

`import re` creates a global variable named `re` that serves as the "module portal", in the sense that it gives you access to the module's operations.

Similarly, `import re as regex` creates such a "portal" under the variable named `regex`.

But when it comes to creating such a portal and loading the module's functionality into it, the importer does not use these references. Instead, it looks for the module in the locations on the module search path (`sys.path`), such as your Python \Lib directory or your current working directory, as a file named re.py (or whatever the name of the module you import is).

Import statements do not address variables but files, like `#include <stdio.h>` in C. They have their own syntax and rules, set by the interpreter: here, `re` is interpreted as a file name rather than as a variable, and it also determines the default name of the module "portal".

That is why `regex` is an operation alias for the portal to `re`, but not an importation alias for the module (for that purpose you have to use the name of the file).
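A minimal sketch of that distinction, for Python 3. The alias and the original name are bound to the same module object, while the importer consults `sys.modules` and the search path by module name. The name `regex_alias_demo` is hypothetical, and registering it in `sys.modules` by hand is shown only to illustrate where the importer looks, not as a recommended practice.

```python
import sys
import re
import re as regex

# Both variables point at one and the same module object.
same_object = regex is re

# Illustration only: once a name is registered in sys.modules,
# the import statement can resolve it.
sys.modules["regex_alias_demo"] = re  # hypothetical name, not a real module
from regex_alias_demo import match

# Clean up the manual registration again.
del sys.modules["regex_alias_demo"]

print(same_object, match is re.match)
```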

  • I have used terms like "module portal" and "operation alias" since I have not found any standard terms for these. Most of the module and importer mechanics depend on the interpreter implementation. In CPython (where use of the C API is common among developers), for example, `create_module` creates modules for the importer (in the form of `PyObject`s) from the provided module specification, and the `PyModule_NewObject` and `PyModule_New` functions create the module instance that bears the module attributes. These can be viewed in the C API module documentation.

  • When I used the term "portal" for the variable created by the import statement, I meant a static portal, not a dynamic one. A change in the module file will not be reflected in a running program that has already imported it (as long as it doesn't reload it), because the program loads a copy of the module and uses that, rather than going back to the module file whenever an operation is needed.
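The "static portal" point can be sketched as follows, for Python 3. The module name `portal_demo` and its `VALUE` attribute are made up for the demonstration, and bytecode caching is switched off so that the reload is guaranteed to read the edited source.

```python
import importlib
import os
import sys
import tempfile

old_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True  # avoid stale .pyc files in this demo

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "portal_demo.py")  # hypothetical module
    with open(path, "w") as f:
        f.write("VALUE = 1\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    import portal_demo
    before = portal_demo.VALUE

    # Edit the file on disk; the already loaded module does not change.
    with open(path, "w") as f:
        f.write("VALUE = 2  # edited on disk\n")
    after_edit = portal_demo.VALUE

    # Only an explicit reload re-reads the file.
    importlib.reload(portal_demo)
    after_reload = portal_demo.VALUE

    sys.path.remove(tmp)

sys.modules.pop("portal_demo", None)
sys.dont_write_bytecode = old_flag

print(before, after_edit, after_reload)
```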


Here is roughly how the loading looks in an interactive session:

>>> import re
>>> re
<module 're' from 'C:\\Programs\\Python35\\lib\\re.py'>
>>> import re as regex
>>> regex
<module 're' from 'C:\\Programs\\Python35\\lib\\re.py'>

You can see that `re` is the module referenced in both cases, and that it was loaded from the file C:\Programs\Python35\lib\re.py (the path may differ depending on where Python is installed).

Uriel
  • Additionally, you can't use `re.search`, for example, if you do `import re as regexp`; you must use `regexp.search`. – Eli Sadoff Nov 26 '16 at 21:44
  • Can you give any references for the concepts of "module portal" (or "trigger"), and "operation alias"? I don't recall coming across them when reading about Python's principles. – alexis Nov 26 '16 at 22:35
5

You cannot treat the module names in import statements as variables. If that were the case, your initial import would surely fail, because `re` is not yet a declared variable. Basically, the import statement is syntactic sugar; it is a statement of its own with its own rules.

One such rule is this: the written module name is understood as if it were a string. That is, Python does not look up a variable with the name `re`; instead it uses the string value `'re'` directly as the sought-after module name. It then searches for a module/package (file) with this name and does the import.

This is the only situation (Edit: Well, see the discussion in the comments...) in the language where this behavior is seen, which is the cause of the confusion. Consider this alternative syntax, which is much more in line with the rest of the Python language:

import 're'
# Or alternatively
module_name = 're'
import module_name

Here, variable expansion is assumed in the import statement. As we know this is not the syntax which was actually chosen for the import statement. One can discuss which syntax is the better one, but the above is definitely more harmonious with the rest of the language syntax.
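The hypothetical syntax above is not valid Python, but the standard library does offer a function form that takes the module name as an ordinary string. A minimal sketch using `importlib.import_module`, with illustrative variable names:

```python
import importlib

# The module name is an ordinary string here, so it can come from a variable.
module_name = "re"
regexp = importlib.import_module(module_name)

# Names are then fetched from the module object by attribute access.
search = getattr(regexp, "search")

print(regexp.__name__, search("b", "abc") is not None)
```

With this form the variable that holds the module and the name used to find it are clearly separate, which is the distinction the import statement hides.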

jmd_dk
  • I think this answer explains the _why_ / rationale better than the other answers which simply address the mechanics. –  Nov 26 '16 at 21:55
  • Regarding your update: the `import` statement performs a name binding, so it requires a valid identifier, not just any arbitrary string. Supporting a hypothetical syntax such as `import ` would suggest more flexibility than is possible. It is in harmony with other syntax involving name-bindings, such as `del`, `global`, `nonlocal`. And if we're suggesting things in hindsight, I would've preferred a more general ability to reference the string of a name being bound, for use cases such as `Record = namedtuple('Record', <...>)`, `uid = data['uid']`, etc. –  Nov 27 '16 at 00:36
  • I do not agree that `del` and `nonlocal` statements operate on raw identifiers directly, in the same sense as `import`. In the case of the first two, you supply an existing object as the argument (just as with any function call), and some operation is done. In the case of `import`, you supply an identifier which does *not* map to any existing object. You are right in the case of `global`; the identifier also does not have to map to any existing object, making my previous statement "This is the only situation in the language where this behavior is seen" false. I guess we now have two situations – jmd_dk Nov 27 '16 at 00:49
  • Also, as you point out, in the `import` statement the identifier is used both for the file name and as the name of the variable holding the imported module, which limits valid identifiers to valid variable names, but this is irrelevant to the argument. One could remove this degeneracy by importing via `re_module = __import__('re')`. This allows for using modules which are stored in files whose names are invalid as variable names. – jmd_dk Nov 27 '16 at 00:59
1

When `from ... import` is used, Python looks in the file named after `from` for what you have requested. This might make it clearer:

import re as regexp

from regexp import search 

This essentially asks Python to look in a file called 'regexp', which it can't find. This is why the alias won't work.
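A minimal sketch of the failure and of the usual way around it, for Python 3. It assumes that no module named `regexp` exists on the search path; `error_type` is only there to show which exception was raised.

```python
import re as regexp

error_type = None
try:
    # Looks for a module named 'regexp', not for the variable regexp.
    from regexp import search
except ImportError as exc:
    error_type = type(exc).__name__
    # The alias is an ordinary variable, so attribute access works.
    search = regexp.search

print(error_type, search("b", "abc") is not None)
```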

1

To get a definite answer on this you'll have to ask the designers themselves, but I think you're asking the wrong question.

The question shouldn't be "Why is it done this way?" but rather "What would be the benefit of doing it the way you're asking?" Surely it can be done, but why should it?

As is, the import statement is dead simple and very intuitive: you give it a file name, and it tries to find it and load it. You can even get fancy with `as` and `from`, but the concept stays simple: you write file names and leave it at that.

What would obfuscating it and making it harder to understand achieve? The only result would be making things arguably more complex.

Python has a history of looking for the rationale behind changes to its design. People asking why function objects aren't subclassable will get a "Why should they?" reply; this behavior doesn't really have a use case. As is, the import is simple, intuitive and reminiscent of including/using files in other languages.

Dimitris Fasarakis Hilliard