My setup:
- Given some Legacy™ C++ code
- Using wrappers that use the Python ↔ C/C++ API, this is compiled into a `.pyd` file
- There is a bunch of Legacy™ Python code that uses this `.pyd` file
- This Python code is then called from MATLAB
- MATLAB R2018a / Python 3.6 / C++11 / Windows 10 / MSVC 2017 Community
- This setup is rigid, i.e., all this code is used by 100s of people in all sorts of different contexts; this non-ideal setup is already the best possible trade-off
The problem:
- MATLAB crashes due to an Access Violation® somewhere in the C++ code.
Obviously, MATLAB can't "look through" the `.pyd` binary to determine the root cause, so this is all I have to go on.
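One way to get slightly more to go on than a bare crash: Python's standard `faulthandler` module can dump the Python-level traceback when the process hits a fatal error (since Python 3.6 it also installs a handler for Windows exceptions). A minimal sketch, assuming the Legacy™ Python code has an entry point where this can be called once before the `.pyd` is used. The helper name `enable_crash_log` and the log file name are made up for illustration, and whether this fires before MATLAB's own crash handler takes over is something I have not verified:

```python
import faulthandler
import os
import tempfile

# Keep a module-level reference so the file object is not garbage-collected;
# faulthandler writes to the underlying file descriptor at crash time.
_crash_log = None


def enable_crash_log(path=None):
    """Hypothetical helper: dump Python tracebacks to `path` on a fatal error."""
    global _crash_log
    if path is None:
        # Assumed log location; any writable path would do.
        path = os.path.join(tempfile.gettempdir(), "my_pylib_crash.log")
    _crash_log = open(path, "w")
    faulthandler.enable(file=_crash_log, all_threads=True)
    return path


log_path = enable_crash_log()
```

If it works, the log should at least name the Python call that entered the C++ code, which narrows down where to set breakpoints.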
What I've tried:
1. Using MSVC2017, build the `.pyd` in Debug mode (`setup.py build --debug`).
   - In MATLAB: `pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe'`
   - Be really annoyed by having to restart MATLAB before you can actually do that.
   - After restarting MATLAB and running said command, in MSVC: Debug → Attach to Process → select `MATLAB.exe`
   - Run the MATLAB code that causes the crash.
   - MATLAB/Python complains: `Python Error: ImportError: DLL load failed: The specified module could not be found.`
2. Try the same as in 1., this time renaming the `my_pylib.pyd` file to `my_pylib_d.pyd` (as found here of all places...).
   - MATLAB/Python complains: `Python Error: ImportError: cannot import name 'my_pylib'`
3. Try the same as 2., this time stating `pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python_d.exe'` (first making sure that a Python3.6 debugging environment has been installed in the MSVC context).
   - Be REALLY annoyed by having to restart MATLAB Every. Fucken. Time. you run that command!
   - Go to The MathWorks Support site and put in a Feature Request to try and get that fixed in R2019a.
   - Restart MATLAB, re-attach MSVC to `MATLAB.exe` and repeat from the top.
   - MATLAB/Python still complains: `Python Error: ImportError: cannot import name 'my_pylib'`
4. Repeat 3., this time forcing MATLAB to also use `python36_d.dll`, because somehow that mechanism seems broken in MATLAB.
   - SCREAM IN RAGE because you have to restart MATLAB AGAIN and not forget to re-attach to the `MATLAB.exe` process in MSVC AGAIN.
   - This time it at least continues with Python code execution, but it trips on every imported module (`numpy`, `scipy`, etc.) with the same `ImportError` as before...
5. Give up on Python3.6.
6. Try again, from the top, using a fresh, non-MSVC installation of Python 3.7, including debugging symbols etc.
   - In MATLAB: `pyversion 'c:\wherever\Python37_64\python.exe'`

<speak_angrily_through_teeth>Oh. Yeah. I forgot. Restart MATLAB AGAIN, run `pyversion` command above AGAIN, run offending MATLAB code, SCREAM WHILE PULLING OUT HAIR because you forgot to re-attach MSVC, restart MATLAB AGAIN because it obviously crashed, re-attach MSVC, run offending MATLAB code.</speak_angrily_through_teeth>
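The `_d` renaming dance above comes down to which file name suffixes a given interpreter accepts for extension modules. A small sketch, using only the standard library, that can be run in each candidate interpreter (`python.exe`, `python_d.exe`) to see what it expects. The function names are hypothetical, and `hasattr(sys, "gettotalrefcount")` is only the usual heuristic for spotting a debug build of CPython:

```python
import importlib.machinery
import sys


def is_debug_build():
    """Heuristic: sys.gettotalrefcount exists only in debug builds of CPython."""
    return hasattr(sys, "gettotalrefcount")


def importable_names(module_name):
    """File names this interpreter would accept for an extension `module_name`."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


def matches_interpreter(filename, module_name):
    """True if `filename` is a name this interpreter would try to import."""
    return filename in importable_names(module_name)


print("debug build:", is_debug_build())
print("accepted names:", importable_names("my_pylib"))
```

On a Windows debug interpreter the accepted names should end in `_d.pyd`, which a release interpreter does not look for. The reverse should hold as well, which may be why the release builds of `numpy`, `scipy`, etc. failed to import in step 4.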
Success! MSVC goes to my API code after MATLAB triggers a breakpoint. The breakpoint is triggered here:

    PyObject *module = PyModule_Create(&moduledef);

where

    static struct PyModuleDef moduledef = {
        PyModuleDef_HEAD_INIT,        /* m_base */
        "my_pylib",                   /* m_name */
        NULL,                         /* m_doc */
        sizeof(struct module_state),  /* m_size */
        my_pylib_methods,             /* m_methods */
        NULL,                         /* m_slots */
        my_pylib_traverse,            /* m_traverse */
        my_pylib_clear,               /* m_clear */
        NULL                          /* m_free */
    };
After downloading the Python 3.7 source code, I can dig a bit deeper. The `PyModule_Create` call is a wrapper that calls the following function in `Objects/moduleobject.c`:

    PyObject *
    PyModule_Create2(struct PyModuleDef* module, int module_api_version)
    {
        if (!_PyImport_IsInitialized(PyThreadState_GET()->interp))
            Py_FatalError("Python import machinery not initialized");
        return _PyModule_CreateInitialized(module, module_api_version);
    }
where the breakpoint is inside the `if()` clause. This means that the `moduledef` doesn't even matter. `_PyImport_IsInitialized()` is a function in `Python/import.c`:

    int
    _PyImport_IsInitialized(PyInterpreterState *interp)
    {
        if (interp->modules == NULL)
            return 0;
        return 1;
    }
which doesn't seem like a very likely candidate to be causing my Access Violation®. Going into the `PyThreadState_GET()` finally made me realize: I'm actually debugging the Python/C++ API instead of my code...
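To take MATLAB out of the loop while hunting for the crash, the import can be tried in a plain Python child process and the exit code inspected. A minimal sketch, assuming the module is importable as `my_pylib` from the chosen interpreter; `try_import` is a hypothetical helper, and the exit code check relies on Windows reporting an access violation as `0xC0000005`:

```python
import subprocess
import sys

ACCESS_VIOLATION = 0xC0000005  # Windows NTSTATUS code for an access violation


def try_import(module_name, python_exe=sys.executable):
    """Import `module_name` in a child process; return (exit_code, stderr)."""
    result = subprocess.run(
        # -X faulthandler makes the child dump a traceback on a fatal error.
        [python_exe, "-X", "faulthandler", "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6 as well
    )
    return result.returncode, result.stderr


code, err = try_import("my_pylib")
if code == 0:
    print("import succeeded")
elif code & 0xFFFFFFFF == ACCESS_VIOLATION:
    print("child crashed with an access violation")
    print(err)
else:
    print("import failed with exit code", code)
    print(err)
```

If the crash reproduces here, MSVC can be attached to (or launch) this much smaller process instead of `MATLAB.exe`, and no MATLAB restart is needed between attempts. If it does not reproduce, that points at something specific to how MATLAB hosts the interpreter.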
Questions:
- What the heck!! Why is debugging in MATLAB using Python3.6 so damn difficult?! What is the "proper" way to do it? I can't find much in the (online) documentation for it...
- Is there a known issue with the Python 3.7 C/C++ API that could be causing this?
- Am I doing something wrong / stupidly? Any tips/pointers on how I can find the problem in my C++ code more effectively?