
Python 2.7 does something peculiar under the hood when importing within a package: it stores package-relative entries for built-in and standard-library modules (among other things) in sys.modules, for reasons I cannot understand. A minimal example is below.

Assume a directory structure in the form:

\BaseFolder
   * __init__.py
   * MainFile.py
   \TestFolder
       * __init__.py
       * TestModule.py

Both __init__.py files are empty. MainFile.py contains only:

import TestFolder.TestModule
import sys
for x in sorted(sys.modules.keys()):
    print x

TestModule.py contains:

import os

Running the MainFile.py with Python gives you the list of imported modules. When you look through the keys for the modules, there's a bunch of junk, but you can find the following keys:

TestFolder
TestFolder.TestModule
TestFolder.os
...
os

If you look at the values for those modules, TestFolder.os is None. But why does it exist in the first place? Why would the modules list register a module that the lookup has just shown not to exist? I assume this occurs because the system checks for an "os" module in TestFolder first (hence TestFolder.os), then looks in the built-ins. But why add an entry just because you checked? Does anyone have insight into why Python would do this? Maybe just so it never checks for modules in those locations again?
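For reference, the None entries can be listed directly instead of being picked out of the full key dump. This is a minimal sketch: the helper name cached_misses is hypothetical, and it is written to run unchanged on Python 2.7 and Python 3 (where such entries normally do not appear, since implicit relative imports were removed).

```python
import sys


def cached_misses(modules=None):
    """Return the sorted names in a module cache whose value is None."""
    if modules is None:
        modules = sys.modules
    # Copy the items first so the scan is safe if the cache changes meanwhile.
    return sorted(name for name, mod in list(modules.items()) if mod is None)


if __name__ == "__main__":
    for name in cached_misses():
        print(name)
```

Appended to MainFile.py after the import of TestFolder.TestModule, this would be expected to include TestFolder.os when run under Python 2.7.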

Charles
Namey
  • It's a sentinel to show that the relative module `TestFolder.os` does not exist. This makes future imports then faster. This is a dupe question, I'll find the original shortly. – Martijn Pieters May 15 '13 at 16:46
  • I agree that this looks like a duplicate symptom of the dummy modules issue, now that it has been stated. Voted to close. – Namey May 15 '13 at 20:16

1 Answer


The cache includes misses as well as hits. A value of None for TestFolder.os just means that Python looked for a package-relative module called "os", didn't find it, and moved on to the next place to look. It caches the fully qualified name "TestFolder.os" because that's what other modules would call it. It sets the value to None so that other modules' imports don't have to check the file system again.
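The idea of caching misses can be illustrated with a toy model. This is only a sketch of the behaviour described above, not CPython's import machinery: toy_import, AVAILABLE, and the probes list are hypothetical names, and the "file system" is just a set of names.

```python
# Toy model only: this is NOT how CPython implements imports.
# AVAILABLE stands in for the modules that exist on disk.
AVAILABLE = {"os", "TestFolder", "TestFolder.TestModule"}


def toy_import(name, package, cache, probes):
    """Resolve `name` from inside `package`, trying the relative name first.

    `cache` plays the role of sys.modules; `probes` records every simulated
    file-system check so the effect of the cached miss is visible.
    """
    relative = package + "." + name
    if relative not in cache:
        probes.append(relative)  # simulated file-system check
        if relative in AVAILABLE:
            cache[relative] = "<module %s>" % relative
        else:
            cache[relative] = None  # remember the miss
    if cache[relative] is not None:
        return cache[relative]

    # Relative lookup missed (now or earlier): fall back to the absolute name.
    if name not in cache:
        probes.append(name)
        if name not in AVAILABLE:
            raise ImportError(name)
        cache[name] = "<module %s>" % name
    return cache[name]


if __name__ == "__main__":
    cache, probes = {}, []
    toy_import("os", "TestFolder", cache, probes)
    toy_import("os", "TestFolder", cache, probes)  # no new probes this time
    print(sorted(cache.items(), key=lambda item: item[0]))
    print(probes)
```

In this model the first call checks both "TestFolder.os" and "os", while the second call is answered entirely from the cache, because the None entry records that the relative lookup has already failed.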

tdelaney
  • That was what I was figuring. The root of my question was the misconception that sys.modules was an actual listing of modules, rather than a cache that records both hits and misses. – Namey May 15 '13 at 20:19
  • To be fair, however, the related Python docs on this are horrible: "This is a dictionary that maps module names to modules which have already been loaded" (http://docs.python.org/2/library/sys.html). If they could just call it a cache of modules that the system has tried to load (using its normal order of resolution), they could avoid some headscratchers. – Namey May 15 '13 at 20:22
  • @Namey - true, experimenting and getting tripped up with edge conditions are all part of the game with python! – tdelaney May 15 '13 at 20:30