19

I'm embedding the python interpreter in a multithreaded C application and I'm a little confused as to what APIs I should use to ensure thread safety.

From what I gathered, when embedding Python it is up to the embedder to acquire the GIL before making any other Python C API call. This is done with these functions:

PyGILState_STATE gstate = PyGILState_Ensure();
// do some python api calls, run python scripts
PyGILState_Release(gstate);

But this alone doesn't seem to be enough. I still got random crashes since it doesn't seem to provide mutual exclusion for the Python APIs.

After reading some more docs I also added:

PyEval_InitThreads();

right after the call to Py_IsInitialized(), but this is where it gets confusing. The docs state that this function will:

Initialize and acquire the global interpreter lock

This suggests that when this function returns, the GIL is locked and has to be unlocked somehow, but in practice this doesn't seem to be required. With this line in place my multithreaded app worked perfectly, and mutual exclusion was maintained by the PyGILState_Ensure/Release functions.
When I tried adding PyEval_ReleaseLock() after PyEval_InitThreads(), the app deadlocked pretty quickly in a subsequent call to PyImport_ExecCodeModule().

So what am I missing here?

Amro
shoosh

4 Answers

9

I had exactly the same problem, and it is now solved by calling PyEval_SaveThread() immediately after PyEval_InitThreads(), as you suggest above. However, my actual problem was that I called PyEval_InitThreads() after Py_Initialize(), which then caused PyGILState_Ensure() to block when called from different, subsequent native threads. In summary, this is what I do now:

  1. There is a global variable:

    static int gil_init = 0; 
    
  2. From the main thread, load the native C extension and start the Python interpreter:

    Py_Initialize();
    
  3. From multiple other threads my app concurrently makes a lot of calls into the Python/C API:

    if (!gil_init) {
        gil_init = 1;
        PyEval_InitThreads();
        PyEval_SaveThread();
    }
    PyGILState_STATE state = PyGILState_Ensure();
    // Call Python/C API functions...    
    PyGILState_Release(state);
    
  4. From the main thread, stop the Python interpreter:

    Py_Finalize();
    

All other solutions I've tried caused either random Python segfaults or deadlocks/blocking in PyGILState_Ensure().

The Python documentation really should be more clear on this and at least provide an example for both the embedding and extension use cases.

forman
4

Eventually I figured it out.
After

PyEval_InitThreads();

You need to call

PyEval_SaveThread();

which properly releases the GIL for the main thread.

shoosh
  • This is wrong and potentially harmful: `PyEval_SaveThread` should always be in conjunction with `PyEval_RestoreThread`. As [explained elsewhere](http://stackoverflow.com/a/15471525/1600898), you shouldn't try to release the lock after initializing it; just leave it to Python to release it as part of its regular work. – user4815162342 Mar 19 '13 at 10:22
  • I don't see why is it harmful if you put all the calls to python in a _Block_ _Allow_ blocks. On the other hand, if you don't call `PyEval_SaveThread();` then your main thread will block the access of other threads to Python. In other words `PyGILState_Ensure()` deadlocks. – khkarens Apr 26 '13 at 11:37
  • This is the only thing that works for both embedding Python and calling into an extension module. – Kevin Smyth Aug 15 '16 at 19:14
  • 1
    Indeed, `PyEval_SaveThread()` must be called by the thread that called `PyEval_InitThreads()`, or a deadlock will happen when a thread tries to call `PyGILState_Ensure()` (since the GIL is not available for retrieval). `PyEval_RestoreThread()` should eventually be called by the same thread that called `PyEval_SaveThread()`, but at that point, it is important that all threads that may call `PyGILState_Ensure()` have finished, or a deadlock may happen, for just the same reason. – andreasdr Jan 04 '18 at 14:50
1

Note that the if (!gil_init) { block in @forman's answer runs only once, so it can just as well be done in the main thread. That lets us drop the flag entirely (to be correct, gil_init would have to be atomic or otherwise synchronized).

PyEval_InitThreads() is meaningful only in CPython 3.6 and older, and has been deprecated in CPython 3.9, so it has to be guarded with a macro.

Given all this, what I am currently using is the following:

In the main thread, run all of

Py_Initialize();
PyEval_InitThreads(); // only on Python 3.6 or older!
/* tstate = */ PyEval_SaveThread(); // maybe save the return value if you need it later

Now, whenever you need to call into Python, do

PyGILState_STATE state = PyGILState_Ensure();
// Call Python/C API functions...    
PyGILState_Release(state);

Finally, from the main thread, stop the Python interpreter

PyGILState_Ensure(); // PyEval_RestoreThread(tstate); seems to work just as well
Py_Finalize();
user7610
-2

Having a multi-threaded C app trying to communicate from multiple threads to multiple Python threads of a single CPython instance looks risky to me.

As long as only one C thread communicates with Python, you should not have to worry about locking, even if the Python application is multi-threaded. If you need multiple Python threads, you can set the application up this way: have the other C threads communicate via a queue with that single C thread, which then farms the work out to multiple Python threads.
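To make the queue idea a bit more concrete, here is a rough sketch using POSIX threads. All names (job_queue, owner_ctx, python_owner and so on) are invented for this example, jobs are plain ints to keep it short, and the actual call into Python is replaced by a placeholder.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64

/* Hypothetical bounded queue: any C thread pushes, one owner thread pops. */
typedef struct {
    int items[QUEUE_CAP];
    size_t head, count;
    int closed;
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} job_queue;

void queue_init(job_queue *q)
{
    q->head = q->count = 0;
    q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called from any C thread; blocks while the queue is full. */
void queue_push(job_queue *q, int job)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = job;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Tells the owner thread that no more jobs will arrive. */
void queue_close(job_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Returns 1 and stores a job, or 0 once the queue is closed and drained. */
int queue_pop(job_queue *q, int *job)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count == 0) {
        pthread_mutex_unlock(&q->lock);
        return 0;
    }
    *job = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return 1;
}

typedef struct {
    job_queue *q;
    long handled;   /* stands in for the results of the Python calls */
} owner_ctx;

/* The only thread that would ever touch the Python/C API. */
void *python_owner(void *arg)
{
    owner_ctx *ctx = arg;
    int job;
    while (queue_pop(ctx->q, &job)) {
        /* Placeholder: a real application would call into Python here. */
        ctx->handled += job;
    }
    return NULL;
}
```

The point of the design is that the other C threads only ever touch the queue, so the Python calls stay confined to python_owner.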

An alternative that might work for you is to have multiple CPython instances, one for each C thread that needs it (of course, communication between the Python programs would then have to go through the C program).

Another alternative might be the Stackless Python interpreter. That does away with the GIL, but I am not sure whether you would run into other problems binding it to multiple threads. Stackless was a drop-in replacement for my (single-threaded) C application.

Anthon