I'm trying to dynamically import Python modules that match a pattern at runtime. The modules are located inside a Python package.

The function I use to find the modules is:

import pkgutil
import re
import sys

def load_modules_from_dir(dirname, pattern=""):
    """Import every module in dirname whose name matches pattern."""
    modules = []
    for importer, package_name, _ in pkgutil.iter_modules([dirname]):
        if re.search(pattern, package_name):
            full_package_name = '%s.%s' % (dirname, package_name)
            # Skip modules that have already been imported.
            if full_package_name not in sys.modules:
                module = importer.find_module(package_name).load_module(full_package_name)
                modules.append(module)
    return modules

I call it as follows:

module_dir = os.path.join(os.path.dirname(__file__))
modules = utils.load_modules_from_dir(module_dir, "jscal$")

On Linux it finds all modules, but on Windows it doesn't find any modules at all. If I print dirname in function load_modules_from_dir, I get: H:\temp\linx\dist\calibrate\dcljscal

I've reproduced it in the Python shell on Windows and narrowed it down to the path delimiter. The following finds nothing:

>>> for x in pkgutil.iter_modules(['H:\temp\linx\dist\calibrate\dcljscal']):print x
...
>>> 

If I replace the Windows path separator with the Linux one, it works:

>>> for x in pkgutil.iter_modules(['H:/temp/linx/dist/calibrate/dcljscal']):print x
...
(<pkgutil.ImpImporter instance at 0x00AD9C60>, 'demojscal', True)
(<pkgutil.ImpImporter instance at 0x00AD9C60>, 'dx2cremjscal', True)
(<pkgutil.ImpImporter instance at 0x00AD9C60>, 'linxjscal', True)
>>>

It also works if I replace \ with \\, basically escaping the Windows path separator:

>>> for x in pkgutil.iter_modules(['H:\\temp\\linx\\dist\\calibrate\\dcljscal']):print x
...
(<pkgutil.ImpImporter instance at 0x00AD9E68>, 'demojscal', True)
(<pkgutil.ImpImporter instance at 0x00AD9E68>, 'dx2cremjscal', True)
(<pkgutil.ImpImporter instance at 0x00AD9E68>, 'linxjscal', True)
>>>

It seems that the path generated by os.path.join(os.path.dirname(__file__)) is not portable.
I would expect that os.path.join() would give me a proper path that I can use unaltered in pkgutil.iter_modules(). What am I doing wrong here?

I'm using Python 2.7.11 on Windows XP.

NZD
  • Staring at `'\t'` doesn't raise any alarms for you? – Eryk Sun Feb 15 '16 at 00:06
  • @eryksun Yes, you are right. If I replace `\t` with `\\t`, it also works in Python shell. But you set me thinking. The path returned by `os.path.join()` must already contain the escaped version of the path. When I print parameter `dirname` in function `load_modules_from_dir`, I get: `H:\temp\linx\dist\calibrate\dcljscal` and not `H: emp\linx\dist\calibrate\dcljscal`. The question now is why does it not work in `pkgutil.iter_modules()` ? – NZD Feb 15 '16 at 01:31
  • `os.path.join(os.path.dirname(__file__))` makes no sense. Use `os.path.abspath(os.path.dirname(__file__))`. – Eryk Sun Feb 15 '16 at 02:02
  • @eryksun Both `os.path.join()` and `os.path.abspath()` give the same result on my system. Thanks for your help. You pointed me in the right direction with the `\t`. – NZD Feb 15 '16 at 19:22
  • Yes, but joining a path of one string to create an absolute path is not the intended effect of `os.path.join`. Use the function that's obviously intended for what you want to do. Don't rely on a side effect of an unrelated function. – Eryk Sun Feb 16 '16 at 04:13

1 Answer

The short answer is: There is no answer to this problem. Strings representing path names can always contain special characters that can bite you at any time.
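As a minimal sketch of how such special characters creep in (the variable names are illustrative only), the snippet below compares an ordinary string literal with a raw literal and an escaped one. It only concerns literals typed into source code or the shell; it makes no claim about strings returned by functions such as os.path.dirname().

```python
# An ordinary literal: the two characters \t are compiled into a single tab.
plain = 'H:\temp'
# A raw literal keeps the backslash and the letter t exactly as written.
raw = r'H:\temp'
# Doubling the backslash yields the same string as the raw literal.
escaped = 'H:\\temp'

print(repr(plain))     # repr() makes the embedded tab visible
print(repr(raw))
print(raw == escaped)
print('\t' in plain)
```

Printing with repr() rather than print alone is a quick way to check whether a path string really contains a backslash or an already-interpreted control character.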

There are a lot of SE posts that try to give a solution. Most of them fix a special case and don't work in general. There is no consensus about what the best approach is. See:
- Unpredictable results from os.path.join in windows
- Python windows path slash
- Path Separator in Python 3
- Windows path in python
- mixed slashes with os.path.join on windows
- Why not os.path.join use os.path.sep or os.sep?

This blog post is also worth mentioning because the writer uses a slightly different approach to get his point across: backslashes-in-windows-filenames

A pragmatic way to handle path names is to replace the Windows path delimiter with the Linux one: path_name = path_name.replace('\\', '/')
Python on Windows XP and later versions of Windows has no problem with C:/tab/newline/return/012/x012.
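The replacement can be wrapped in a small helper. This is only a sketch, and to_forward_slashes is a hypothetical name, not a standard library function. It acts only on backslashes that are really present in the string, so it cannot repair a literal in which \t has already been compiled into a tab.

```python
def to_forward_slashes(path_name):
    """Return path_name with every backslash replaced by a forward slash."""
    return path_name.replace('\\', '/')

# A raw literal, so the backslashes are really part of the string.
windows_path = r'H:\temp\linx\dist\calibrate\dcljscal'
print(to_forward_slashes(windows_path))
```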

When you have to print or record the path, you can use os.path.normpath() to convert it to the native representation of the OS or platform you are running Python on.
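A short sketch of that conversion follows. On Windows, os.path is the ntpath module, so ntpath is imported directly here to make the snippet behave the same on any platform; on Linux, os.path.normpath() would leave the forward slashes unchanged.

```python
import ntpath  # the Windows implementation behind os.path

forward = 'H:/temp/linx/dist/calibrate/dcljscal'
# normpath() converts forward slashes to the Windows separator.
native = ntpath.normpath(forward)
print(native)
```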

NZD
  • It would be a very bad bug if `os.path.abspath(os.path.dirname(__file__))` yielded a filepath that somehow reinterpreted `\t` as a tab. Generally only the *compiler* does that for *string literals*. – Eryk Sun Feb 16 '16 at 04:19
  • You can't always use slash in place of backslash. An extended path prefixed by ``\\?\`` requires backslash. This prefix bypasses the normal path processing in order to exceed the 260 character path limit. All Windows does is replace ``\\?\`` with the NT DOS devices prefix ``\??\`` before making the system call, and in the NT kernel namespace, *only backslash* is a path delimiter. Also, if you're creating a commandline to run another program with subprocess, the program may treat slashes as option switches. – Eryk Sun Feb 16 '16 at 04:26