
This is happening in Python 3.4.3 from non-Python-created threads:

https://docs.python.org/3.4/c-api/init.html#non-python-created-threads

There are a couple of questions on SO that look at a similar issue from within C/C++ applications.

AssertionError (3.X only) when calling Py_Finalize with threads

Fatal error during Py_Finalize in embedded Python application

PyEval_InitThreads in Python 3: How/when to call it? (the saga continues ad nauseum)

But none of those specifically deal with Grand Central Dispatch. I don't know if it matters or not as it's likely all just threads under the hood.

Attempting to apply the knowledge from those posts, however, still causes me issues.

Here is where I am currently: I have an Objective-C class that represents my Python runtime, with the following relevant methods:

- (void)initialize
{
    Py_Initialize();
    PyEval_InitThreads();

    PyObject* sysPath = PySys_GetObject((char*)"path");

    for(NSString * path in self.pythonPath){
        PyList_Append(sysPath, objc_convert_string(path));
    }

    // not calling PyEval_SaveThread 
    // causes beginTask below to die on 
    // PyEval_AcquireThread

    // self.threadState = PyThreadState_Get();
    // release the GIL, this shouldn't need to be
    // done, as I understand it, 
    // but as the comment above states, if I don't
    // beginTask will fail at PyEval_AcquireThread
    self.threadState = PyEval_SaveThread();
    self.running = YES;
}

That is how I initialize Python. I then invoke Python commands via:

- (void)beginTask:(nonnull void (^)(void))task completion:(nullable void (^)(void))completion
{
    dispatch_async(self.pythonQueue, ^{

        PyInterpreterState * mainInterpreterState = self.threadState->interp;
        PyThreadState * myThreadState = PyThreadState_New(mainInterpreterState);
        PyEval_AcquireThread(myThreadState);

        // Perform any Py_* related functions here
        task();

        PyEval_ReleaseThread(PyThreadState_Get());

        if (completion){
            dispatch_async(dispatch_get_main_queue(), ^{
                completion();
            });
        }

    });
}

For a long-running operation, I have some code that does quite a bit of Jinja template rendering and saving to the filesystem. I want to clean up after that is done, so I call Py_Finalize and then reinitialize using the method above. This is where my issue appears:

- (void)finalize
{
    PyEval_RestoreThread(self.threadState);

    // Problems here
    Py_Finalize();

    self.running = NO;
}

This causes the following error in Python:

Exception ignored in: <module 'threading' from '/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py'>
Traceback (most recent call last):
  File "/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py", line 1296, in _shutdown
    _main_thread._delete()
  File "/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py", line 1019, in _delete
    del _active[get_ident()]
KeyError: 140735266947840

I have tried to deal with this in a few different ways, including using PyGILState_Ensure() and PyGILState_Release() in beginTask:

- (void)beginTask:(nonnull void (^)(void))task completion:(nullable void (^)(void))completion
{

    dispatch_async(self.pythonQueue, ^{

        PyGILState_STATE state = PyGILState_Ensure();

        task();

        PyGILState_Release(state);

        if (completion){
            dispatch_async(dispatch_get_main_queue(), ^{
                completion();
            });
        }
    });
}

But that will cause this error in the finalize method above:

Exception ignored in: <module 'threading' from '/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py'>
Traceback (most recent call last):
  File "/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py", line 1289, in _shutdown
    assert tlock.locked()
AssertionError: 

So I'm pretty much stuck on one question: how do I call Py_Finalize without getting some kind of error? Clearly I'm not understanding something. I can also confirm, via Xcode breakpoints, that my dispatch_async block runs on a thread other than the main thread.

Update

Tinkering a bit, I discovered that, this:

PyObject* module = PyImport_ImportModule("requests");

while I am on another thread, will cause this error when I call Py_Finalize:

Exception ignored in: <module 'threading' from '/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py'>
Traceback (most recent call last):
  File "/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py", line 1296, in _shutdown
    _main_thread._delete()
  File "/Users/blitz6/Library/Developer/Xcode/DerivedData/Live-cblwxzmbsdpraeebpregeuhikwzh/Build/Products/Debug/LivePython.framework/Resources/python3.4/lib/python3.4/threading.py", line 1019, in _delete
    del _active[get_ident()]
KeyError: 140735266947840

If I import:

PyObject* module = PyImport_ImportModule("os");
OR
PyObject* module = PyImport_ImportModule("json");

everything runs fine. The problems start when I import my own modules.

The failure occurs inside Py_Finalize, in wait_for_thread_shutdown(). Judging by the comment in the source, it is related to:

/* Wait until threading._shutdown completes, provided the threading module was imported in the first place. The shutdown routine will wait until all non-daemon "threading" threads have completed. */

Specifically in wait_for_thread_shutdown:

PyThreadState *tstate = PyThreadState_GET();
PyObject *threading = PyMapping_GetItemString(tstate->interp->modules,
                                              "threading");
if (threading == NULL) {
    /* threading not imported */
    PyErr_Clear();
    return;
}

tstate comes back NULL but threading is not NULL, which skips the PyErr_Clear() code path and executes:

result = _PyObject_CallMethodId(threading, &PyId__shutdown, "");
if (result == NULL) {
    PyErr_WriteUnraisable(threading);
}
else {
    Py_DECREF(result);
}
Py_DECREF(threading);

If I just:

PyObject* module = PyImport_ImportModule("json");

Then, PyErr_WriteUnraisable(threading); is executed in wait_for_thread_shutdown

result = _PyObject_CallMethodId(threading, &PyId__shutdown, "");
if (result == NULL) {
    PyErr_WriteUnraisable(threading);
}
AJ Venturella

2 Answers


As mentioned previously, I was thinking about both my code and the embedded Python interpreter incorrectly. It had nothing to do with GCD and everything to do with my misunderstanding of some facets of embedded Python. I may still have misunderstandings, but the logic below matches my expectations of what it should be doing, and my error is gone.

What I was attempting to do was run one-off tasks, not keep anything around. Thinking I needed threading here is what tripped me up.

What I was doing was acquiring the GIL and then importing some Python code that used the threading module. When you do that, the interpreter registers that the threading module has been imported, and when you call Py_Finalize it jumps through some hoops to ensure that any child threads that may be present are shut down. I was effectively pulling the rug out from under it, which was causing my error. Instead, the work I needed done was better suited to Py_NewInterpreter. Calling Py_EndInterpreter runs the same threading shutdown procedure as Py_Finalize, but it is isolated to that sub-interpreter.
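The shape of that failure can be pictured with a small Python sketch. This is only an illustration, not CPython's actual implementation: `registry`, `register_current_thread`, and `unregister_current_thread` are hypothetical names standing in for the ident-keyed bookkeeping hinted at by `del _active[get_ident()]` in the traceback. It assumes the entry is created on one thread and the cleanup later runs on a different one.

```python
import threading

# Hypothetical stand-in for an ident-keyed registry such as the `_active`
# dict seen in the traceback; not CPython's real code.
registry = {}

def register_current_thread():
    # Record whichever thread runs this, keyed by its ident.
    registry[threading.get_ident()] = threading.current_thread().name

def unregister_current_thread():
    # Same shape as `del _active[get_ident()]` from the traceback.
    del registry[threading.get_ident()]

# Registration happens on a worker thread (compare: a dispatched block being
# the first place that imports something which pulls in `threading`).
worker = threading.Thread(target=register_current_thread, name="worker")
worker.start()
worker.join()

# Cleanup then runs on the main thread, whose ident was never registered.
try:
    unregister_current_thread()
    outcome = "ok"
except KeyError:
    outcome = "KeyError"

print(outcome)
```

The lookup fails because the key belongs to the thread that did the registering, not to the thread doing the cleanup.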

So my final GCD one-and-done code looks as follows:

Initialization

- (void)initialize
{
    Py_Initialize();
    PyEval_InitThreads();

    [self updateSysPath];

    // Release the GIL
    self.threadState = PyEval_SaveThread();
    self.running = YES;

}

Task Execution

- (void)beginTask:(nonnull void (^)(Runtime * __nonnull))task completion:(nullable void (^)(Runtime * __nonnull))completion
{
    dispatch_async(self.pythonQueue, ^{

        PyInterpreterState * mainInterpreterState = self.threadState->interp;
        PyThreadState * taskState = PyThreadState_New(mainInterpreterState);

        // Acquire the GIL
        PyEval_AcquireThread(taskState);

        PyThreadState* tstate = Py_NewInterpreter();
        [self updateSysPath];

        task(self);

        // when Py_EndInterpreter is called, the current 
        // thread state is set to NULL, 
        // so we need to put ourselves back on
        // the taskState, and release the GIL
        Py_EndInterpreter(tstate);
        PyThreadState_Swap(taskState);

        // release the GIL
        PyEval_ReleaseThread(taskState);

        // cleanup
        PyThreadState_Clear(taskState);
        PyThreadState_Delete(taskState);

        if (completion){
            dispatch_async(dispatch_get_main_queue(), ^{
                completion(self);
            });
        }
    });
}

Finalization

- (void)finalize
{
    // acquire the GIL
    PyEval_RestoreThread(self.threadState);
    Py_Finalize();
    self.running = NO;
}
AJ Venturella

You said:

But none of those specifically deal with Grand Central Dispatch. I don't know if it matters or not as it's likely all just threads under the hood

Yes, ultimately, GCD blocks must execute on one OS thread or another, but GCD queues are not an abstraction on top of a single thread, and (with the exception of the main queue) make no promises about thread affinity at all. In other words, there's no way to guarantee that submitted blocks are executed on the same thread or any specific thread, nor is it wise to make any assumptions about the thread they end up executing on.

The safest, most straightforward way to go about this will likely be for you to explicitly manage the OS threads (either via NSThread or the pthreads API) that you plan to use for running Python code. The thread management inside GCD is, at the end of the day, a private implementation detail, meaning that even if you manage to get something that works now, Apple could make some subtle change to the thread management in GCD and break your app in the future.

In sum, GCD queues probably aren't the right tool for this job.

I recently posted another answer describing a way to manage a private thread for wrapping a third-party library that demanded thread affinity. It may be of some interest to you.
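The idea of a privately managed thread can be sketched in Python as a rough analogue. This is not the NSThread or pthreads code the answer refers to: `DedicatedWorker`, `call`, and `stop` are hypothetical names, and the sketch assumes a single long-lived worker that drains a queue so that every submitted callable runs on the same OS thread.

```python
import queue
import threading

class DedicatedWorker:
    """Runs every submitted callable on one long-lived thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            item = self._tasks.get()
            if item is None:              # sentinel: stop the loop
                break
            func, done, box = item
            try:
                box.append((True, func()))
            except Exception as exc:      # hand the error back to the caller
                box.append((False, exc))
            finally:
                done.set()

    def call(self, func):
        """Run func on the worker thread and wait for its result."""
        done, box = threading.Event(), []
        self._tasks.put((func, done, box))
        done.wait()
        ok, value = box[0]
        if not ok:
            raise value
        return value

    def stop(self):
        self._tasks.put(None)
        self._thread.join()

worker = DedicatedWorker()
# Ask the worker for its own thread ident several times.
idents = {worker.call(threading.get_ident) for _ in range(5)}
worker.stop()
print(len(idents))
```

Because a single thread services the queue, every call observes the same thread ident, which is the affinity guarantee a GCD queue does not make.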

ipmcc
  • This is a nice solution, and I learned something! Thank you!! After doing some more digging however, I discovered that my issue actually arose from thinking about my code I needed to execute AND the Python Interpreter incorrectly. In my case, I'm performing 1-off tasks, so I don't need to worry about thread affinity. The task runs inside the GCD thread, finishes and it's done forever. – AJ Venturella May 14 '15 at 13:30