
I'm trying to debug an issue that causes sys.modules['numpy'] to get overwritten. I've added some print statements to numpy.__init__, and when I try to import numpy, I get this output:

numpy.__init__ running
id(sys.modules) = 89034704
id(sys.modules['numpy']) = 161528304
numpy.__init__ running
id(sys.modules) = 89034704
id(sys.modules['numpy']) = 177135864

Numpy has a number of circular imports, which should work as described in this answer. But in my case, instead of getting the partially initialized numpy module from sys.modules, numpy gets imported again, and numpy.__init__ gets executed a second time, leading to a crash.

How can I instrument sys.modules to get some visibility into who is overwriting sys.modules['numpy'] and when? Normally I would write a dict subclass, but I don't think it's safe to change sys.modules to point to my own object. I tried overriding sys.modules.__setattr__, but that's a read-only attribute.

Context: I'm trying to debug this issue in PyCall, a Julia library. PyCall embeds a Python interpreter in a running Julia process and delegates the import to PyImport_ImportModule from CPython. The problem above happens inside a single call to PyImport_ImportModule, so I hope this question is answerable with knowledge of Python / CPython alone, without knowledge of Julia / PyCall.

cberzan
  • I would first try to reproduce the error only in Python, without involving Julia (or, ideally, direct C API calls) at all. Do you have example Python code that can do that? – BrenBarn Jan 07 '15 at 18:57
  • The answer to [this question](http://stackoverflow.com/questions/14778407/do-something-every-time-a-module-is-imported) suggests overriding `__import__` might work. I suspect the problem is due to some sort of path mixup. – BrenBarn Jan 07 '15 at 19:17

2 Answers

You can change sys.modules from a plain dict to one that prints out assignments, e.g.:

import sys
import traceback

class noisydict(dict):
    def __setitem__(self, key, value):
        print('ASSIGNED: key={!r} value={!r} at:'.format(key, value))
        traceback.print_stack()
        return dict.__setitem__(self, key, value)

sys.modules = noisydict(sys.modules)

This may or may not work if the overwriting happens in C code: such code may call the underlying dict.__setitem__ directly rather than doing sys.modules[name] = newmodule as Python code would. Still, it's worth a try!
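
To illustrate the caveat above, here is a minimal sketch of how a write can skip an overridden `__setitem__`. The class name `RecordingDict` and its `log` attribute are hypothetical, made up for this example. The explicit `dict.__setitem__(...)` call stands in for what C code does when it uses the dict C API (for example `PyDict_SetItem`) directly. Whether the import machinery takes that route depends on the Python version, so treat this as an illustration rather than a diagnosis.

```python
class RecordingDict(dict):
    """Hypothetical dict subclass that records keys set via obj[key] = value."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.log = []  # keys that went through the overridden __setitem__

    def __setitem__(self, key, value):
        self.log.append(key)
        super().__setitem__(key, value)


d = RecordingDict()

# Ordinary Python-level assignment dispatches to the override.
d["a"] = 1

# Calling the base-class slot directly skips the override, much as C code
# using the dict C API would.
dict.__setitem__(d, "b", 2)

# In CPython, dict.update() also stores items without calling the override.
d.update(c=3)

print("keys stored:  ", sorted(d))
print("keys recorded:", d.log)
```

All three keys end up in the dict, but only the first assignment is recorded, which matches the "prints nothing" symptom reported in the comments below.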

Alex Martelli
  • Unfortunately this does not help; I tried putting your code in a file `sysmoduleshack.py`. Then from the interpreter: `import sysmoduleshack; import numpy` prints nothing. I suspect you're right that the dict is modified from C code. – cberzan Jan 07 '15 at 19:02
  • @cberzan, then your life gets harder -- and 100% dependent on the exact version of Python, which you don't mention (the importing system changes often). At least try `python -v` to tell you exactly what's being imported up to the time of the crash you observe -- again, it's a start. – Alex Martelli Jan 07 '15 at 19:10
  • You forgot to `print` the "ASSIGNED" string you created. But I think the `import` mechanism that modifies `sys.modules` does so at the C level, so I don't think this will work. In fact, when I try it, it seems to prevent imported modules from getting into the new `sys.modules` at all. – BrenBarn Jan 07 '15 at 19:12
  • Tx @BrenBarn for spotting my `thinko`, edited now to fix that. – Alex Martelli Jan 07 '15 at 19:18

Thanks to @BrenBarn for pointing me to https://stackoverflow.com/a/14778568/744071. The following worked for my purposes:

importhack.py:

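import traceback

try:
    import builtins  # Python 3
except ImportError:
    import __builtin__ as builtins  # Python 2

# Keep a reference to the real import function so we can delegate to it.
old_import = builtins.__import__

def my_import(module, *args, **kwargs):
    # Report every import request together with the stack that triggered it.
    print("my_import({}) caused by:".format(module))
    traceback.print_stack()
    return old_import(module, *args, **kwargs)

builtins.__import__ = my_import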

Usage:

>>> import importhack
>>> import numpy
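
On Python 3.4 or later, a similar trace can probably be obtained without replacing `__import__`, by putting a finder at the front of `sys.meta_path`. This is only a sketch under that assumption, not what I used: the class name `ImportLogger` is made up, and the standard-library module `colorsys` stands in for numpy so the example runs anywhere. The finder always returns `None`, so the regular finders still perform the actual import.

```python
import sys
import traceback


class ImportLogger:
    """Hypothetical meta path finder: logs lookups for one package, then declines."""

    def __init__(self, watched):
        self.watched = watched
        self.seen = []  # full names the import system asked us about

    def find_spec(self, fullname, path=None, target=None):
        if fullname == self.watched or fullname.startswith(self.watched + "."):
            self.seen.append(fullname)
            print("import system is looking for {!r}, requested from:".format(fullname))
            traceback.print_stack()
        return None  # decline, so the next finder on sys.meta_path handles it


# "colorsys" is a stand-in; substitute "numpy" when debugging the real problem.
logger = ImportLogger("colorsys")
sys.meta_path.insert(0, logger)

sys.modules.pop("colorsys", None)  # make sure the first import is not cached
import colorsys  # not cached: finders are consulted, so this is logged
import colorsys  # cached in sys.modules: finders are not consulted again

print(logger.seen)
```

Finders are only consulted when the module is not already in `sys.modules`, so a second log entry for the same name would mean the cached entry was missing or had been replaced at that point.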

I believe the original problem in PyCall.jl was caused by calling PyImport_ImportModule before the Python interpreter was fully initialized.

cberzan