While working with the Python C API, I found that the Python interpreter crashes when it is initialized a second time and import numpy is executed after each initialization. Any other import (e.g. import time) works fine.

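#include <Python.h>
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* exit */

int main(int argc, char ** argv)
{
    /* Repeatedly initialize and finalize the interpreter. */
    while(1){
        printf("Initializing python interpreter...\n");
        Py_Initialize();
        /* PyRun_SimpleString returns non-zero if the statement raised. */
        if(PyRun_SimpleString("import numpy")) {
            exit(1);
        }
        printf("Finalizing python interpreter...\n");
        Py_Finalize();
    }
    return 0;
}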

The above program crashes on both of my test systems (Ubuntu and Manjaro, regardless of the Python version) with a segmentation fault when import numpy is executed the second time.

The documentation (https://docs.python.org/3/c-api/init.html?#c.Py_FinalizeEx) does say: "Some extensions may not work properly if their initialization routine is called more than once; this can happen if an application calls Py_Initialize() and Py_FinalizeEx() more than once."

But shouldn't there be a way to properly clear the interpreter's memory so it can be initialized multiple times? For example, if I have a program that lets the user run a custom Python script, it should be possible to run the same script multiple times without restarting the program. Any clues?

  • What about using [Py_NewInterpreter](https://docs.python.org/3/c-api/init.html?#c.Py_NewInterpreter)? – dvhh Dec 13 '19 at 01:38
  • It looks like this has been pretty widely reported before https://stackoverflow.com/questions/7676314/py-initialize-py-finalize-not-working-twice-with-numpy https://stackoverflow.com/questions/14843408/python-c-embedded-segmentation-fault https://stackoverflow.com/questions/16779799/py-initialize-and-py-finalize-and-matplotlib – DavidW Dec 13 '19 at 08:22

1 Answer

You can't just declare that all memory should be cleared or reset; Python isn't in control of all of it. Extension modules are actual shared object/DLL files that can do literally anything a plain C program can do, and they're not required to register all of their actions in a way that lets the core Python interpreter undo them on finalization. Python can't know that one part of the SO/DLL's memory stores data that must be cleared, while another part is static data that should be left alone.

It's perfectly possible to run a script multiple times, but you don't do it by finalizing and reinitializing; you simply run the script multiple times within a single initialization (and hope it's written in an idempotent fashion).

As an alternative, if you're on a UNIX-like box (read: anything but Windows), "resetting" can be done a different way: exec your own program again, replacing the running process image with a fresh copy. This resets more thoroughly than Python itself can, though even then it's not foolproof; e.g. file descriptors that weren't opened with O_CLOEXEC stay open in the newly exec-ed process.
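A rough sketch of the exec approach on a POSIX system. Both helper names (set_cloexec and restart_self) are hypothetical, and the sketch assumes that argv[0] still resolves to the running executable, which is not guaranteed in general.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: mark `fd` close-on-exec so it does not leak into
 * the re-exec'ed process. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1) {
        return -1;
    }
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1 ? -1 : 0;
}

/* Hypothetical helper: replace the current process image with a fresh
 * copy of the same program. Returns (with -1) only if exec fails. */
int restart_self(char *const argv[])
{
    fflush(NULL);            /* flush stdio buffers; exec discards them */
    execvp(argv[0], argv);   /* assumption: argv[0] names this executable */
    perror("execvp");        /* reached only when execvp failed */
    return -1;
}
```

A program would call set_cloexec on each descriptor it opened without O_CLOEXEC and then call restart_self(argv) from main. Descriptors that stay unmarked remain open across the exec, which is the caveat described above.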

ShadowRanger