
imp.find_module() does not find modules from zipped eggs.

How can I find modules that can come from either place: directories or zipped eggs? It is important in my case that I can pass a path argument, as imp.find_module() supports.

Background

Somehow packages get installed twice in our environment: once as a zipped egg and once as plain files. I want to write a check that tells me if a module is installed twice. See https://stackoverflow.com/a/23990989/633961

guettli
  • What if your module is deep within a package hierarchy inside a zipped egg? What path do you want in that case? There *is* no file directly equivalent to the module you've requested. – Kevin Mar 13 '15 at 18:01
  • @Kevin imp.find_module() only finds "toplevel" modules. For example you can find "os" but you can't find "path" (like from "os.path"). I just want a find_module() that works like the import statement of the python interpreter does. The interpreter loads zipped eggs. – guettli Mar 14 '15 at 08:09
  • The import mechanism in 2.x is incompletely exposed. Under 3.x, you could get what you want with `importlib`; `imp` is deprecated. Unfortunately, this also means the whole thing is substantially more complicated than `imp` ever was. – Kevin Mar 14 '15 at 13:06
  • @Kevin thank you for the hint. `importlib` exists even in Python2.7. It is a subset, but maybe better than nothing. – guettli Mar 14 '15 at 17:41
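
The `importlib` route mentioned in the comments can be sketched roughly as follows. This assumes Python 3.4 or later and is not part of the original thread; `find_module_spec` is a hypothetical helper name. `importlib.machinery.PathFinder.find_spec()` takes an optional path list, much like `imp.find_module()` does, and consults the path hooks, so ZIP archives on the path are searched as well.

```python
import importlib.machinery


def find_module_spec(name, path=None):
    """Locate a top-level module without importing it.

    `path` is a list of directories and/or ZIP archives (eggs);
    None means sys.path. Returns a ModuleSpec or raises ImportError.
    (Hypothetical helper, shown for illustration only.)
    """
    spec = importlib.machinery.PathFinder.find_spec(name, path)
    if spec is None:
        raise ImportError("%s not found" % name)
    return spec
```

`spec.origin` holds the file location and `spec.loader` the object that would load the module. Note that `PathFinder` only covers path-based lookups; built-in modules such as `sys` are handled by a separate finder.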

1 Answer


Assuming Python 2, the information I think you need is in PEP 302 - New Import Hooks (the PEP is outdated for Python 3, which is completely different in this regard).

Finding and importing modules from ZIP archives is implemented in zipimport, which is "hooked" into the import machinery as described by the PEP. When PEP 302 and importing from ZIPs were added to Python, the imp module was not adapted, i.e. imp is totally unaware of PEP 302 hooks.
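
To see the hook mechanism at work, one can ask which importer is responsible for a given path entry. The following is a small sketch, assuming Python 3 (`pkgutil.get_importer()` also exists in Python 2.7, but the importer types it returns differ there); `describe_path_entry` is a hypothetical name used only for this illustration.

```python
import pkgutil


def describe_path_entry(entry):
    """Return the type name of the importer handling a path entry.

    pkgutil.get_importer() checks sys.path_importer_cache first and
    otherwise tries each hook in sys.path_hooks; it returns None when
    no hook accepts the entry.
    """
    importer = pkgutil.get_importer(entry)
    return None if importer is None else type(importer).__name__
```

Under Python 3, a ZIP archive or zipped egg should be reported as handled by `zipimporter`, and an ordinary directory by `FileFinder`.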

A "generic" find_module function that finds modules like imp does and also respects PEP 302 hooks would look roughly like this:

import imp
import sys

def find_module(fullname, path=None):
    try:
        # 1. Try imp.find_module(), which searches sys.path, but does
        # not respect PEP 302 import hooks.
        result = imp.find_module(fullname, path)
        if result:
            return result
    except ImportError:
        pass
    if path is None:
        path = sys.path
    for item in path:
        # 2. Scan path for import hooks. sys.path_importer_cache maps
        # path items to optional "importer" objects, that implement
        # find_module() etc.  Note that path must be a subset of
        # sys.path for this to work.
        importer = sys.path_importer_cache.get(item)
        if importer:
            try:
                result = importer.find_module(fullname, [item])
                if result:
                    return result
            except ImportError:
                pass
    raise ImportError("%s not found" % fullname)

if __name__ == "__main__":
    # Provide a simple CLI for `find_module` above.
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--path", action="append")
    parser.add_argument("modname", nargs='+')
    args = parser.parse_args()
    for name in args.modname:
        print find_module(name, args.path)

Note, though, that the result from finding a module in a ZIP archive looks quite different from what imp.find_module returns: you'll get a zipimport.zipimporter object for the particular ZIP. The little program above prints the following when asked to find a module from a zipped egg, a regular module and a built-in module:

$ python find_module.py grin os sys
<zipimporter object "<my venv>/lib/python2.7/site-packages/grin-1.2.1-py2.7.egg">
(<open file '<my venv>/lib/python2.7/os.py', mode 'U' at 0x10a0bbf60>, '<my venv>/lib/python2.7/os.py', ('.py', 'U', 1))
(None, 'sys', ('', '', 6))
fpbhb
  • The call to importer.find_module(fullname) does not pass the `path` argument. This causes false results if path is not sys.path. – guettli Mar 17 '15 at 09:11
  • I use this now: http://stackoverflow.com/questions/23697497/pip-installs-package-twice – guettli Mar 17 '15 at 09:12
  • I've updated the code. It will now correctly honor `path`. My point was to explain what's going on. – fpbhb Mar 17 '15 at 11:11
  • You pass in `path` to `importer.find_module()`. But AFAIK the argument `path` gets ignored by zipimporter :-( – guettli Mar 17 '15 at 11:14
  • Yes, I know it is ignored by `zipimporter`. But there could be *other* import hooks installed that use `path`. Insofar my answer addresses your question more generally, as the code will find any module the Python interpreter would find in a similar way the real `import` implementation does (modulo the difficulties posed by the intricate interplay of import.c, PEP 302 etc. in Python 2). – fpbhb Mar 17 '15 at 11:20
  • BTW, if you are worried that `zipimporter.find_module()` ignoring `path` is problematic, because you want the code to honor the path you give it: this is not the case. The `zipimporter` objects in the importer cache are already bound to one specific ZIP archive/egg, and will look up the module in question there. The protocol introduced with PEP 302 is somewhat inconsistent with the `imp` API in this regard. – fpbhb Mar 17 '15 at 12:03
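
Because each importer is bound to a single path entry, querying the path one entry at a time reveals every copy of a module, which is what the "installed twice" check from the question needs. Below is a minimal sketch under the assumption of Python 3.4+; `find_all_locations` is a hypothetical helper and only handles top-level modules.

```python
import importlib.machinery
import sys


def find_all_locations(name, path=None):
    """Return the origin of every copy of top-level module `name`.

    Each entry of `path` (default: sys.path) is searched on its own,
    so a module present both as plain files and inside a zipped egg
    yields two results.
    """
    if path is None:
        path = sys.path
    locations = []
    for entry in path:
        spec = importlib.machinery.PathFinder.find_spec(name, [entry])
        # Namespace packages have no origin; skip them and any repeats.
        if spec is not None and spec.origin is not None:
            if spec.origin not in locations:
                locations.append(spec.origin)
    return locations
```

A module would count as installed twice when `len(find_all_locations(name)) > 1`.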