I'm trying to track down the cause of a bug: https://github.com/numba/numba/issues/3027

It seems that (for some users of numba, but not all)

import sys
import numba

@numba.njit
def some_func(begin1, end1, begin2, end2):
  if begin1 > begin2: return some_func(begin2, end2, begin1, end1)
  return end1 + 1 >= begin2

sys.stdout = sys.stderr
x = id(sys.stdout)
some_func(0,1,2,3)
y = id(sys.stdout)
assert x==y # Fail

the value of sys.stdout differs before and after the call to some_func. I'd like to know whether this is because:

  • reload(sys) was called, or
  • sys.stdout was reassigned

It seems difficult to know because if reload was called, variables assigned to a module namespace survive the reload, except if they're reinitialized by the module itself:

import sys
sys.stdout = None
sys.zzz = 123
sys = reload(sys)
sys.stderr.write("sys.stdout = {}\n".format(sys.stdout)) # Reset to file object
sys.stderr.write("sys.zzz = {}\n".format(sys.zzz)) # Surprise success!
sys.stderr.flush()
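
The same behaviour can be reproduced without touching sys. The following is only a minimal sketch, assuming Python 3.4 or later (where reload lives in importlib); the module name reload_demo_mod and its single attribute are hypothetical and exist only for this demonstration:

```python
import importlib
import os
import sys
import tempfile

# Hypothetical throwaway module, written to a temp directory for the demo.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "reload_demo_mod.py"), "w") as fh:
    fh.write("value = 'initial'\n")

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import reload_demo_mod

reload_demo_mod.value = None   # defined by the module body, so reload resets it
reload_demo_mod.extra = 123    # not defined by the module body

reloaded = importlib.reload(reload_demo_mod)

same_object = reloaded is reload_demo_mod   # reload reuses the module object
value_after = reload_demo_mod.value         # reinitialised by the module body
extra_after = reload_demo_mod.extra         # survives the reload
```

Here value is reinitialised by the module body while extra survives, which mirrors what happens to sys.stdout and sys.zzz above.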
user48956
  • Try screwing with something else in `sys` and see if that gets reset too. – user2357112 Jun 15 '18 at 17:22
  • `reload()` will 'copy across' the globals from the module loaded into a new module object to the existing one, thus additional globals remain. – Martijn Pieters Jun 15 '18 at 17:23
  • Doh. Good idea. – user48956 Jun 15 '18 at 17:23
  • Oh, and check for [`sys.setdefaultencoding`](https://docs.python.org/2/library/sys.html#sys.setdefaultencoding). It's deleted from `sys` by the `site` module during interpreter startup, and reloading `sys` restores it. I believe the most common reason to reload `sys` is to restore that function. – user2357112 Jun 15 '18 at 17:27
  • See https://docs.python.org/2/library/functions.html#reload for the details on what `reload()` does. – Martijn Pieters Jun 15 '18 at 17:29
  • @MartijnPieters. Thanks. I know what sys is doing (I explained the question with the second example) - I want to know when it's doing it. – user48956 Jun 15 '18 at 17:31
  • @user48956: I've added a method in my answer that'll let you find out! – Martijn Pieters Jun 15 '18 at 17:39
  • @user2357112, assigning to sys.api_version was an easy way to test. In my example, it is not reset, whereas sys.stdout is. – user48956 Jun 15 '18 at 17:42

1 Answer

Although highly frowned upon, some Python 2 code reloads sys to restore the sys.setdefaultencoding() function. This is almost always the cause of this problem.

So you could detect that sys was reloaded by checking for the setdefaultencoding attribute:

import sys
if hasattr(sys, 'setdefaultencoding'):
    # sys was reloaded!
    sys.stderr.write("sys was reloaded\n")

This would only work on Python 2. Or you could augment the sys.flags struct sequence with an extra field:

from collections import namedtuple
import sys, re

# match every field name; newer Python 3 releases also have boolean flags (e.g. dev_mode=False)
_sys_flags_fields = re.findall(r'(\w+)=', repr(sys.flags))
_sys_flags_augmented = namedtuple('flags', _sys_flags_fields + ['sys_not_reloaded'])
sys.flags = _sys_flags_augmented(*sys.flags + (1,))

after which you can test with:

if not getattr(sys.flags, 'sys_not_reloaded', 0):
    sys.stderr.write("sys was reloaded\n")

Augmenting sys.flags is safer than most other sys manipulations, as third-party code might be relying on the documented sys attributes and methods to be untampered with, and it also works on Python 3.

You could prevent sys from being reloaded by wrapping __builtin__.reload / importlib.reload / imp.reload:

try:
    # Python 2
    import __builtin__ as targetmodule
except ImportError:
    # Python 3.4 and up
    try:
        import importlib as targetmodule
        targetmodule.reload   # AttributeError on older Python 3 releases
    except (ImportError, AttributeError):
        # Python 3.0 - 3.3
        import imp as targetmodule

from functools import wraps

def reload_wrapper(f):
    @wraps(f)
    def wrapper(module):
        if getattr(module, '__name__', None) == 'sys':
            raise ValueError('sys should never be reloaded!')
        return f(module)
    return wrapper

targetmodule.reload = reload_wrapper(targetmodule.reload)

Instead of raising an exception, you could just use the warnings module or some other mechanism to record or make noise about the fact that sys is being reloaded; you probably want to include the caller into such warnings.
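
A warning-based variant might look like the following. This is only a sketch: the name warn_on_sys_reload and the choice of RuntimeWarning are assumptions made for illustration, and stacklevel=2 is used so that the warning is attributed to the code that called reload() rather than to the wrapper itself:

```python
import warnings
from functools import wraps

def warn_on_sys_reload(f):
    """Wrap a reload function so that reloading sys warns instead of raising."""
    @wraps(f)
    def wrapper(module):
        if getattr(module, '__name__', None) == 'sys':
            # stacklevel=2 points the warning at the caller of reload()
            warnings.warn('sys is being reloaded', RuntimeWarning, stacklevel=2)
        return f(module)
    return wrapper

# Installed the same way as the raising wrapper, for example:
# targetmodule.reload = warn_on_sys_reload(targetmodule.reload)
```

Running Python with -W error::RuntimeWarning would turn the warning back into an exception with a full traceback, which can help pinpoint the offending import.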

Execute the above module as early as possible to ensure that you can catch out the code that is doing this, possibly by inserting it into the sitecustomize module, or by triggering it from a .pth file installed into the site-packages directory. Any line in a .pth file that starts with import is executed as Python code by the site.py module at Python startup, so the following contents in such a file:

import yourpackage.sysreload_neutraliser

would inject an import at Python startup time.
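
Creating such a file could be scripted along these lines. This is a sketch under stated assumptions: the helper write_pth_hook and the file name sysreload_neutraliser.pth are hypothetical, and the target directory has to be a real site-packages directory that you are allowed to write to:

```python
import os

def write_pth_hook(site_packages_dir, module_name,
                   pth_name='sysreload_neutraliser.pth'):
    """Write a .pth file whose single line imports module_name at startup."""
    pth_path = os.path.join(site_packages_dir, pth_name)
    with open(pth_path, 'w') as fh:
        # site.py executes lines in .pth files that start with "import"
        fh.write('import {}\n'.format(module_name))
    return pth_path

# Example call (assumes site.getsitepackages() is available in your environment):
# import site
# write_pth_hook(site.getsitepackages()[0], 'yourpackage.sysreload_neutraliser')
```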

Martijn Pieters
  • Thanks -- that's a pretty comprehensive solution and handy to prevent reload (which is useful). For people who come here simply looking to know if sys has been reloaded, assigning to sys.api_version and testing whether it reverts back is a simpler way. – user48956 Jun 15 '18 at 17:44
  • @user48956: other code could still be relying on `sys.api_version` being accurate, however. – Martijn Pieters Jun 15 '18 at 17:45
  • Since you can copy its value, assign to it, and check it again later, you can know if reload happened (assuming no one else is writing to sys.api_version) – user48956 Jun 15 '18 at 17:49
  • @user48956: Yes, I know, but code could actually *read the value and alter behaviour* based on the value. If your code has altered the value, then you broke someone else's library. – Martijn Pieters Jun 15 '18 at 17:50
  • True. Also, I didn't know you could reassign builtins. +1 for that. – user48956 Jun 15 '18 at 18:01
  • @user48956: I've added another option to detect reloading, where `sys.flags` is extended. – Martijn Pieters Jun 15 '18 at 18:16