
I have a local module named tokenize.py, which masks the standard library module of the same name. I only discovered this when I tried to import an external module (sklearn.linear_model), which in turn runs import tokenize and expects to get the standard library module, but gets my local module instead.

This is related to "How to access a standard-library module in Python when there is a local module with the same name?", but the setting is different, because applying the solution given there would require modifying the external module.

An option would be to rename the local tokenize.py, but I would prefer not to do so as "tokenize" best expresses the module's role.

To illustrate the problem, here is a sketch of the module structure:

   \my_module
      \__init__.py
      \tokenize.py
      \use_tokenize.py

In use_tokenize.py, there is the following import:

import sklearn.linear_model

This results in the following error when invoking python my_module/use_tokenize.py:

Traceback (most recent call last):
  File "use_tokenize.py", line 1, in <module>
    import sklearn.linear_model
  <...>
  File "<EDITED>/lib/python2.7/site-packages/sklearn/externals/joblib/format_stack.py", line 35, in <module>
    generate_tokens = tokenize.tokenize
AttributeError: 'module' object has no attribute 'tokenize'

Is there any way to suppress local modules when importing an external module?

edit: Added python2.7 as a tag due to comments that the solution varies by Python version

saffsd
  • Which version of Python are you using? The rules, and the workarounds, are different. – abarnert Feb 28 '13 at 00:47
  • But really, you should rename your local `tokenize.py`. A script that automatically fixes all of the `import tokenize` and `tokenize.` in all of your hundreds of scripts and modules should take a few minutes to write and a few seconds to run… – abarnert Feb 28 '13 at 00:48
  • If you create a module with the same name as a standard library module, no one can install your library without masking the standard lib module. The solution is to not give your module that name. (You can still have `tokenize.py` if it's part of a package, because in that case a simple `import tokenize` won't see it.) – BrenBarn Feb 28 '13 at 01:02
  • I appreciate that renaming the module is a solution but that doesn't actually answer the question. Given that the standard library is fairly large, namespace collisions are not that surprising, and the relative imports mechanism addresses this issue. However, how do you deal with third-party code that doesn't use relative imports? – saffsd Feb 28 '13 at 01:16
  • Going to also throw my hat into the "rename tokenize" pile. This may not cause a huge error but namespace collisions with the standard library are easy enough to avoid. – GordonsBeard Feb 28 '13 at 01:18
  • I've edited the motivation for keeping the name, which I hope better motivates the need to be able to mask it for importing an external module. – saffsd Feb 28 '13 at 02:24
  • Relative imports are for parts of a package importing other parts of the same package. If code is "third party" then by definition it is not part of your package, so it can't use relative imports to import your library. It sounds like the problem is that your code makes its modules available in the top-level module namespace instead of inside a package. You don't need to rename `tokenize.py` as long as it is inside a package. Can you give more specific information about the module/package structure of your code and where it is installed? – BrenBarn Feb 28 '13 at 08:47
  • I've added an illustrative example, I hope it makes the question clearer. The issue is that I do -not- want the third party library to import my local module. My understanding is that under the relative import system, standard library modules are imported in preference to local modules when there is a namespace collision, so this would not be an issue - is that much correct? – saffsd Mar 01 '13 at 05:41

2 Answers


The problem is not so much the module name as the fact that you're running a module as if it were a script. When Python runs a script, it adds the script's containing directory as the first element in sys.path, so all module lookups from anywhere will search that directory first.

To avoid this, ask Python to execute it as a module instead:

python -m my_module.use_tokenize

Or, of course, you could just keep executable scripts out of your module hierarchy.
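
As a rough illustration of the difference between the two invocations, the sketch below builds a throwaway package in a temporary directory and asks a child interpreter for its sys.path[0] in each case. The file name show_path.py and the helper first_path_entry are made up for this example, and it assumes a current Python 3 interpreter started without any option that alters the default path handling, rather than the 2.7 setup from the question.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entry(args, cwd):
    """Run a child interpreter in cwd and return its sys.path[0] as a real path."""
    out = subprocess.check_output([sys.executable] + args, cwd=cwd)
    # Older interpreters may report a relative or empty entry, so normalise it.
    return os.path.realpath(os.path.join(cwd, out.decode().strip()))


# Hypothetical layout mirroring the question: <root>/my_module/show_path.py
root = os.path.realpath(tempfile.mkdtemp())
pkg = os.path.join(root, "my_module")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "show_path.py"), "w") as f:
    f.write("import sys\nprint(sys.path[0])\n")

# Run as a script: the script's own directory comes first.
as_script = first_path_entry([os.path.join("my_module", "show_path.py")], root)

# Run as a module with -m: the current working directory comes first.
as_module = first_path_entry(["-m", "my_module.show_path"], root)

print("script:", as_script)
print("module:", as_module)
```

In the script case the package directory itself is searched first, which is why a tokenize.py sitting next to the script can shadow the standard library one for every import in the process.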

Eevee

The interpreter searches for modules in the directories listed in sys.path. To prevent the third-party module from seeing the local module at import time, we remove the first entry, which is the directory containing the script being run, from the path. This can be achieved by:

import sys
sys.path = sys.path[1:]
import sklearn.linear_model #using the original example.

However, this will not work if the local tokenize has already been imported, and it will also prevent the local tokenize from being imported, even if we restore the old sys.path as follows:

import sys
old_path = sys.path
sys.path = sys.path[1:]
import sklearn.linear_model #using the original example.
sys.path = old_path

This is because the Python interpreter maintains an internal mapping of imported modules, so that later requests for the same module are fulfilled from this mapping. This mapping is global to the interpreter, so import tokenize returns the same module from any code that runs it - which is exactly the behavior we are trying to alter. To achieve this, we have to alter this mapping. The easiest way to do this is to simply delete the relevant entry from sys.modules.

import sys
old_path = sys.path
sys.path = sys.path[1:]
import sklearn.linear_model  # using the original example.
sys.path = old_path
del sys.modules['tokenize']  # get rid of the mapping to the standard library tokenize
import tokenize  # cause our local tokenize to be imported
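
The caching behaviour described above can be seen in isolation with a small sketch that does not depend on sklearn. It assumes Python 3 and uses a made-up module name, dup_name, written into two temporary directories; the helper make_dir_with_module is likewise hypothetical.

```python
import importlib
import os
import sys
import tempfile


def make_dir_with_module(origin):
    """Create a temp directory holding dup_name.py, tagged with its origin."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "dup_name.py"), "w") as f:
        f.write("ORIGIN = %r\n" % origin)
    return directory


dir_a = make_dir_with_module("a")
dir_b = make_dir_with_module("b")

sys.path.insert(0, dir_a)
importlib.invalidate_caches()
first = importlib.import_module("dup_name")    # found in dir_a

sys.path.insert(0, dir_b)                      # dir_b is now searched first...
second = importlib.import_module("dup_name")   # ...but sys.modules answers instead

del sys.modules["dup_name"]                    # drop the cached entry
third = importlib.import_module("dup_name")    # fresh lookup along sys.path

print(first.ORIGIN, second is first, third.ORIGIN)
```

Note that first still refers to the module loaded from dir_a after the entry is deleted. In the same way, code that already imported the standard library tokenize keeps its own reference, and only later import statements pick up the local module.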
saffsd