I was playing with examples in order to answer a question posted here on SO and found it hard to understand the mechanics through which Python's import * messes up the scope.

First, a bit of context: this question does not deal with a practical issue; I understand well that from foo import * is frowned upon (rightly so), and I grasp that this is for reasons deeper than code clarity. My interest here is in understanding the mechanics that cause bad behaviour with circular import *s. In other words, I understand that the observed behaviour is expected; I don't understand why.

What I'm not able to understand are the problems that arise when an imported module (b) refers back to the importing module (a) using *. I managed to observe subtle differences in behaviour depending on whether the importing module uses * or not, but the overall (bad) behaviour is the same. I couldn't find a clear explanation either in the documentation or on SO.

Investigating the behaviour through what is available in the scope, I built a small example, based on the above-mentioned question and a few searches here on SO and elsewhere, that illustrates the differences in the scope's content. I try to demonstrate it as concisely as I can. All code and experiments below were run with Python 2.7.8.


Working scenarios

First, a trivial module containing one class, a.py:

class A:
    pass

A first variant of client code, importing module a, b_v1.py:

from pprint import pprint

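def dump_frame(offset=0):
    import sys
    # Inspect the caller's frame (offset selects frames further up the stack).
    frame = sys._getframe(1+offset)
    # Copy the globals so that merging in the locals does not mutate them.
    d = dict(frame.f_globals)
    d.update(frame.f_locals)
    return d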

print 'before import v1'
pprint (dump_frame())

import a

print 'after import v1'
pprint (dump_frame())
print a.A()

Second variant of the same code, importing * from module a, b_v2.py:

from pprint import pprint

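def dump_frame(offset=0):
    import sys
    # Inspect the caller's frame (offset selects frames further up the stack).
    frame = sys._getframe(1+offset)
    # Copy the globals so that merging in the locals does not mutate them.
    d = dict(frame.f_globals)
    d.update(frame.f_locals)
    return d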

print 'before import v2'
pprint (dump_frame())

from a import * 

print 'after import v2'
pprint (dump_frame())
print A()

  • Running both b_v1 and b_v2 produces the same output before the import, and both are able to instantiate A, as expected. After the import, however, they differ, again as expected. I highlight the difference:

b_v1.py has in its scope

'a': <module 'a' from '.../stackoverflow/instance/a.py'>

while b_v2.py does not, but has

'A': <class a.A at 0x...>

  • Both before and after the import, the scope contains __builtins__ set to <module '__builtin__' (built-in)>.

  • Both variants succeed in instantiating A.
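
The difference between the two variants can be reproduced without any files. The following is a minimal sketch, not the code from the question: it registers a throwaway module under the hypothetical name scope_demo_a (standing in for a.py) and runs each import statement in a fresh dictionary, so that only the names bound by the import show up. The helper public_names is likewise an assumption made for the illustration. It is written to run on Python 2.7 as well as Python 3.

```python
import sys
import types

# Hypothetical stand-in for a.py: a module object holding one class A.
demo = types.ModuleType("scope_demo_a")
exec("class A:\n    pass\n", demo.__dict__)
sys.modules["scope_demo_a"] = demo  # make it importable by name


def public_names(namespace):
    """Return the sorted names in a namespace, skipping dunder entries."""
    return sorted(name for name in namespace if not name.startswith("__"))


# Variant 1: a plain import binds the module object itself.
ns_v1 = {}
exec("import scope_demo_a", ns_v1)

# Variant 2: a star import copies the module's public names instead.
ns_v2 = {}
exec("from scope_demo_a import *", ns_v2)

print(public_names(ns_v1))  # ['scope_demo_a']
print(public_names(ns_v2))  # ['A']
```

The first namespace only knows the module, so the class has to be reached as scope_demo_a.A; the second only knows A and has no name for the module at all.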


Not working scenarios

The intriguing behaviour appears when a.py is changed to contain a circular reference to b (in both the b_v1 and b_v2 variants).

Adjusted code of a.py:

from b_v1 import *
class A:
    pass

(For brevity, only one version of a.py is shown; in the b_v2.py case the import names that module instead of b_v1.)

In my observations of the contents of the scope in the scenario with a circular reference, I see:

  • In both variants, before the import in a, __builtins__ is similar to the cases above. After the import, however, it is changed and contains a dict of

    'ArithmeticError': , 'AssertionError': , 'AttributeError': , ...

which is needlessly long to paste here.

  • The changed __builtins__ is present twice. This I can understand as being a consequence of the import, and it would probably not happen if the code were inside a function.
  • In variant b_v2 the module a is not present in the scope; it is present in variant b_v1.

  • In both variants, instantiation of A fails. Given that in variant b_v1 the module is present in the scope (and therefore, I assume, was successfully imported), I had expected to be able to instantiate A. This is not the case. There are differences, however: b_v1.py fails with AttributeError: 'module' object has no attribute 'A', whereas b_v2.py fails with a NameError. In the latter case the error is always the same, regardless of whether I try to instantiate as A() (as in the working example) or as a.A().
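
The b_v1-style failure can be isolated with a small, self-contained sketch. It is an illustration under stated assumptions rather than the exact setup above: the module names part_a and part_b are hypothetical stand-ins for a.py and b_v1.py, the files are written to a temporary directory, and the driver imports part_a first. It should run on Python 2.7 and Python 3, although the wording of the error message differs between the two.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module names standing in for a.py and b_v1.py.
SOURCES = {
    "part_a.py": "from part_b import *\nclass A:\n    pass\n",
    "part_b.py": "import part_a\nprint(part_a.A())\n",
}

tmp_dir = tempfile.mkdtemp()
for filename, text in SOURCES.items():
    with open(os.path.join(tmp_dir, filename), "w") as handle:
        handle.write(text)
sys.path.insert(0, tmp_dir)

caught = None
try:
    # part_a starts running and imports part_b; part_b then gets back the
    # half-executed part_a module, in which class A is not defined yet.
    importlib.import_module("part_a")
except AttributeError as exc:
    caught = exc

print(type(caught).__name__)
print(caught)
```

The name part_a resolves inside part_b, which matches the module being present in the scope, but the module object it refers to has not yet reached its class statement.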


Summarizing my questions:

  • Through what mechanics does a circular import * mess up the scope?

  • Why is it not possible to instantiate A in the case b_v1, although the module is in the scope?

h7r
  • You may find it illustrative to also try examples that use `from a import A` instead of `from a import *`. The `import *` is not the problem; the problem is trying to access something from `a` when `a` is not fully loaded yet. – BrenBarn Feb 16 '15 at 22:46

1 Answer

Python modules are executed from top to bottom. Import statements are executable just like any other. When an import statement is run, it does these things (simplified for expository purposes, see the language reference for full details):

  1. Check whether the module is listed in sys.modules. If it is, return it immediately.
  2. Find the module (usually but not always by searching through the filesystem).
  3. Create an empty entry for the module in sys.modules, with an empty namespace.
  4. Execute the module from top to bottom within the newly-created namespace.
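
The ordering of steps 3 and 4 can be observed directly. The sketch below is only an illustration: it writes a throwaway module under the hypothetical name import_steps_demo into a temporary directory, and the module records, while its own body is running, whether it is already listed in sys.modules. It should run on Python 2.7 and Python 3.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen only for this demonstration.
MODULE_NAME = "import_steps_demo"

SOURCE = (
    "import sys\n"
    "# Evaluated while the module body is still executing (step 4).\n"
    "registered_during_exec = __name__ in sys.modules\n"
    "value = 42\n"
)

tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, MODULE_NAME + ".py"), "w") as handle:
    handle.write(SOURCE)
sys.path.insert(0, tmp_dir)

# Step 1: nothing is cached yet, so the module has to be found and executed.
was_cached_before = MODULE_NAME in sys.modules

first = importlib.import_module(MODULE_NAME)

# A second import hits the sys.modules cache and returns the same object.
second = importlib.import_module(MODULE_NAME)

print(was_cached_before)             # False
print(first.registered_during_exec)  # True
print(first is second)               # True
```

Because the entry exists before the body finishes, any import of the same name issued during that window returns a module that is only partly populated.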

Suppose we have files like this:

a.py:

from b import *
foo = object()

b.py:

from a import *
print(repr(foo))

Further suppose that a.py gets imported first. Let's go through this line-by-line:

  1. Someone else imports a. A reference to a is stored in sys.modules['a'] before we even begin executing it.
  2. a.py runs from b import *. This translates to "import b and then grab everything out of b's namespace into a's namespace."
  3. Python places an empty module object in sys.modules['b'].
  4. b.py runs from a import *. Python imports a.
  5. The import of a returns immediately since sys.modules['a'] exists.
  6. Since a.py hasn't yet executed foo = object(), a.foo doesn't yet exist, so it cannot be dumped into b's namespace.
  7. b.py crashes on a NameError.
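
The walkthrough above can be checked with a short script. This is a sketch under assumptions: the two files are generated in a temporary directory, and they are named cyc_a and cyc_b (hypothetical names standing in for a and b) so that they cannot collide with anything else on the import path. It should run on Python 2.7 and Python 3.

```python
import importlib
import os
import sys
import tempfile

# Same contents as a.py and b.py above, under hypothetical module names.
SOURCES = {
    "cyc_a.py": "from cyc_b import *\nfoo = object()\n",
    "cyc_b.py": "from cyc_a import *\nprint(repr(foo))\n",
}

tmp_dir = tempfile.mkdtemp()
for filename, text in SOURCES.items():
    with open(os.path.join(tmp_dir, filename), "w") as handle:
        handle.write(text)
sys.path.insert(0, tmp_dir)

caught = None
try:
    # Plays the role of "someone else imports a" in step 1.
    importlib.import_module("cyc_a")
except NameError as exc:
    caught = exc

print(type(caught).__name__)  # NameError
print(caught)
```

The star import in cyc_b succeeds but has nothing to copy, because cyc_a has not yet executed its assignment to foo; the failure only surfaces on the next line, when foo is looked up.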
Kevin