12

I wrote a C extension (mycext.c) for Python 3.2. The extension relies on constant data stored in a C header (myconst.h). The header file is generated by a Python script, and the same script then makes use of the freshly compiled module. The workflow in the Python 3 script (not shown completely) is as follows:

configure_C_header_constants() 
write_constants_to_C_header() # write myconst.h
os.system('python3 setup.py install --user') # compile mycext
import mycext
mycext.do_stuff()

This works perfectly fine the first time it runs in a Python session. If I repeat the procedure in the same session (for example, in two different test cases of a unittest), the first compiled version of mycext is always (re)loaded.

How do I effectively reload an extension module with the latest compiled version?

user1069152
  • It's not exactly constant if you need to change it all the time... Put the constants in a configuration file. – Lennart Regebro Nov 28 '11 at 15:28
  • They will be constant in the real application (it will not use Python). I use Python to generate the constants and unittest the C code. – user1069152 Nov 29 '11 at 08:40
  • Make a config file until you have figured out what the constants should be. – Lennart Regebro Nov 29 '11 at 09:56
  • Thanks for the suggestion. I am testing an algorithm, the constants are application specific (I cannot know them before hand). From my incomplete problem description it is not clear why I cannot do it the way you suggest. The answer provided by Sven does exactly what I want, though. – user1069152 Dec 04 '11 at 22:03
  • Indeed, it is not clear, because there is no reason. You *can* do it that way, I promise. :-) – Lennart Regebro Dec 05 '11 at 10:00

3 Answers

14

You can reload modules in Python 3.x by using the imp.reload() function. (This function used to be a built-in in Python 2.x. Be sure to read the documentation -- there are a few caveats!)
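For a pure-Python module this is a one-liner. (The sketch below uses importlib.reload(), the modern spelling; in Python 3.2/3.3 the same function lives in imp.reload().)

```python
import importlib  # imp.reload() in Python 3.2/3.3; importlib.reload() since 3.4
import json       # any already-imported pure-Python module serves as a demo

reloaded = importlib.reload(json)
print(reloaded is json)  # True: the existing module object is updated in place
```

As explained next, however, this does not help for a recompiled C extension.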

Python's import mechanism will never dlclose() a shared library. Once loaded, the library will stay until the process terminates.

Your options (sorted by decreasing usefulness):

  1. Move the module import to a subprocess, and call the subprocess again after recompiling, i.e. you have a Python script do_stuff.py that simply does

    import mycext
    mycext.do_stuff()
    

    and you call this script using

    import subprocess, sys
    subprocess.call([sys.executable, "do_stuff.py"])
    
  2. Turn the compile-time constants in your header into variables that can be changed from Python, eliminating the need to reload the module.

  3. Manually dlclose() the library after deleting all references to the module (a bit fragile since you don't hold all the references yourself).

  4. Roll your own import mechanism.

Here is an example of how this can be done. I wrote a minimal Python C extension mini.so that only exports an integer called version.

    >>> import ctypes
    >>> libdl = ctypes.CDLL("libdl.so")
    >>> libdl.dlclose.argtypes = [ctypes.c_void_p]
    >>> so = ctypes.PyDLL("./mini.so")
    >>> so.PyInit_mini.argtypes = []
    >>> so.PyInit_mini.restype = ctypes.py_object 
    >>> mini = so.PyInit_mini()
    >>> mini.version
    1
    >>> del mini
    >>> libdl.dlclose(so._handle)
    0
    >>> del so
    

    At this point, I incremented the version number in mini.c and recompiled.

    >>> so = ctypes.PyDLL("./mini.so")
    >>> so.PyInit_mini.argtypes = []
    >>> so.PyInit_mini.restype = ctypes.py_object 
    >>> mini = so.PyInit_mini()
    >>> mini.version
    2
    

    You can see that the new version of the module is used.

    For reference and experimenting, here's mini.c:

    #include <Python.h>
    
    static struct PyModuleDef minimodule = {
       PyModuleDef_HEAD_INIT, "mini", NULL, -1, NULL
    };
    
    PyMODINIT_FUNC
    PyInit_mini(void)
    {
        PyObject *m = PyModule_Create(&minimodule);
        PyModule_AddObject(m, "version", PyLong_FromLong(1));
        return m;
    }
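Coming back to option 1: when do_stuff() takes input and produces output, you can pipe pickled data through the subprocess. This is only a sketch; the inline child program below stands in for a real do_stuff.py that would import mycext and call mycext.do_stuff(*args).

```python
import pickle
import subprocess
import sys

# Hypothetical stand-in for do_stuff.py: in the real workflow the child
# would `import mycext` and call mycext.do_stuff(*args) instead of sum().
CHILD = """\
import pickle, sys
args = pickle.load(sys.stdin.buffer)
pickle.dump(sum(args), sys.stdout.buffer)
"""

def call_in_subprocess(args):
    # Pickle the arguments into the child's stdin and unpickle the result
    # from its stdout. Each call starts a fresh interpreter, so the child
    # always loads the latest build of the extension.
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        input=pickle.dumps(args),
        stdout=subprocess.PIPE,
        check=True,
    )
    return pickle.loads(proc.stdout)

print(call_in_subprocess([3, 4]))  # 7
```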
    
Sven Marnach
  • Thanks, `imp.reload(mypythonmod)` works fine for Python modules, but I am dealing with a C extension module. `imp.reload(mycext)` still reloads the originally imported version of the extension module. – user1069152 Nov 28 '11 at 12:45
  • Could you elaborate a bit more on option 1? I have no experience whatsoever with subprocesses. I tried `subprocess.call(['import', 'mycext'])` and the interpreter stays idle. Tried `subprocess.Popen(['import', 'mycext'])`, how do I then call `mycext.do_stuff()`? – user1069152 Nov 29 '11 at 08:48
  • @user1069152: Edited my answer. – Sven Marnach Nov 29 '11 at 12:04
  • Option 4 works perfectly. Option 1 gets complex when do_stuff requires input and output arguments. – user1069152 Dec 04 '11 at 21:56
  • @user1069152: Well, option 4 is quite a hack and not very portable. To make the more robust option 1 work, you could use the `pickle` module to pipe the arguments to the subprocess and get the return values back. – Sven Marnach Dec 05 '11 at 12:38
0

There is another way: build (or copy) the extension under a new module name each time, import it under that name, and rebind your reference to it.
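A minimal sketch of that idea: copy the freshly built library to a unique filename, import the copy under a new module name, and rebind your reference. (The demo uses a plain .py file as a stand-in for the compiled .so; importlib.util.spec_from_file_location loads a real extension the same way. All file names here are illustrative.)

```python
import importlib.util
import itertools
import pathlib
import shutil

_counter = itertools.count()

def load_fresh(module_path):
    """Copy the module file to a unique name and import that copy.

    The dynamic loader caches shared libraries by path, so a recompiled
    .so under the same name would not be picked up; a new filename
    sidesteps that cache.
    """
    src = pathlib.Path(module_path)
    fresh_name = f"{src.stem}_v{next(_counter)}"
    fresh_path = src.with_name(fresh_name + src.suffix)
    shutil.copy(src, fresh_path)
    spec = importlib.util.spec_from_file_location(fresh_name, fresh_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Demo with a plain .py file standing in for the compiled extension:
pathlib.Path("fake_ext.py").write_text("version = 1\n")
ext = load_fresh("fake_ext.py")
print(ext.version)  # 1
pathlib.Path("fake_ext.py").write_text("version = 2\n")
ext = load_fresh("fake_ext.py")  # rebind the reference after rebuilding
print(ext.version)  # 2
```

Note that each reload leaks the previously loaded copy, since the old libraries are never dlclose()d; for a unittest session that is usually acceptable.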

yihuang
0

Update: I have since created a Python library around this approach.


Rather than using the subprocess module, you can use multiprocessing. This allows the child process to inherit all of the memory from the parent (on UNIX systems).

Because the child is forked from the parent, you need to be careful not to import the C extension module into the parent process.

If the function returns a value that depends on the C extension, unpickling that return value can also force the extension to be imported in the parent.

import multiprocessing as mp
import sys


def subprocess_call(fn, *args, **kwargs):
    """Executes a function in a forked subprocess"""
    
    ctx = mp.get_context('fork')
    q = ctx.Queue(1)
    is_error = ctx.Value('b', False)
    
    def target():
        try:
            q.put(fn(*args, **kwargs))
        except BaseException as e:
            is_error.value = True
            q.put(e)
    
    ctx.Process(target=target).start()
    result = q.get()    
    if is_error.value:
        raise result
    
    return result


def my_c_extension_add(x, y):
    assert 'my_c_extension' not in sys.modules.keys()
    # ^ Sanity check, to make sure you didn't import it in the parent process

    import my_c_extension
    return my_c_extension.add(x, y)


print(subprocess_call(my_c_extension_add, 3, 4))

If you want to extract this into a decorator for a more natural feel, you can do:

class subprocess:
    """Decorate a function to hint that it should be run in a forked subprocess"""
    def __init__(self, fn):
        self.fn = fn
    def __call__(self, *args, **kwargs):
        return subprocess_call(self.fn, *args, **kwargs)


@subprocess
def my_c_extension_add(x, y):
    assert 'my_c_extension' not in sys.modules.keys()
    # ^ Sanity check, to make sure you didn't import it in the parent process

    import my_c_extension
    return my_c_extension.add(x, y)


print(my_c_extension_add(3, 4))

This can be useful if you are working in a Jupyter notebook, and you want to rerun some function without rerunning all your existing cells.

Notes

This answer is only relevant on Linux/macOS, where a fork() system call is available.

Tobias Bergkvist