My setup:

  • Given some Legacy™ C++ code
  • Using wrappers that use the Python ↔ C/C++ API, this is compiled into a .pyd file
  • There is a bunch of Legacy™ Python code that uses this .pyd file
  • This Python code is then called from MATLAB
  • MATLAB R2018a / Python 3.6 / C++11 / Windows 10 / MSVC 2017 Community
  • This setup is rigid, i.e., all this code is used by 100s of people in all sorts of different contexts; this non-ideal setup is already the best possible trade-off

The problem:

  • MATLAB crashes due to an Access Violation® somewhere in the C++ code.

Obviously, MATLAB can't "look through" the .pyd binary to determine the root cause, so this is all I have to go on.

What I've tried:

    1. Using MSVC2017, build the .pyd in Debug mode (setup.py build --debug).
       • In MATLAB: pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python.exe'
       • Be really annoyed by having to restart MATLAB before you can actually do that.
       • After restarting MATLAB and running said command, in MSVC: Debug → Attach to Process → select MATLAB.exe
       • Run the MATLAB code that causes the crash.
       • MATLAB/Python complains: Python Error: ImportError: DLL load failed: The specified module could not be found.
    2. Try the same as in 1., this time renaming the my_pylib.pyd file to my_pylib_d.pyd (as found here of all places...)
       • MATLAB/Python complains: Python Error: ImportError: cannot import name 'my_pylib'
    3. Try the same as 2., this time stating pyversion 'c:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\python_d.exe' (first making sure that a Python3.6 debugging environment has been installed in the MSVC context).
       • Be REALLY annoyed by having to restart MATLAB Every. Fucken. Time. you run that command!
       • Go to The MathWorks Support site and put in a Feature Request to try and get that fixed in R2019a
       • Restart MATLAB, re-attach MSVC to MATLAB.exe and repeat from the top
       • MATLAB/Python still complains: Python Error: ImportError: cannot import name 'my_pylib'
    4. Repeat 3., this time forcing MATLAB to also use python36_d.dll because somehow that mechanism seems broken in MATLAB.
       • SCREAM IN RAGE because you have to restart MATLAB AGAIN and not forget to re-attach to the MATLAB.exe process in MSVC AGAIN.
       • This time it at least continues with Python code execution, but it trips on every imported module (numpy, scipy, etc.) with the same ImportError as before ...
    5. Give up on Python3.6.
    6. Try again, from the top, using a fresh, non-MSVC installation of Python 3.7, including debugging symbols etc.
       • In MATLAB: pyversion 'c:\wherever\Python37_64\python.exe'
       • <speak_angrily_through_teeth> Oh. Yeah. I forgot. Restart MATLAB AGAIN, run pyversion command above AGAIN, run offending MATLAB code, SCREAM WHILE PULLING OUT HAIR because you forgot to re-attach MSVC, restart MATLAB AGAIN because it obviously crashed, re-attach MSVC, run offending MATLAB code </speak_angrily_through_teeth>
       • Success! MSVC goes to my API code after MATLAB triggers a breakpoint. The breakpoint is triggered here:

      PyObject *module = PyModule_Create(&moduledef);
      

      where

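      static struct PyModuleDef moduledef = 
      {
          PyModuleDef_HEAD_INIT,        /* m_base     */
          "my_pylib",                   /* m_name     */
          NULL,                         /* m_doc      */
          sizeof(struct module_state),  /* m_size: per-module state */
          my_pylib_methods,             /* m_methods  */
          NULL,                         /* m_slots    */
          my_pylib_traverse,            /* m_traverse */
          my_pylib_clear,               /* m_clear    */
          NULL                          /* m_free     */
      };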
      

      After downloading the Python 3.7 source code, I can dig a bit deeper. The PyModule_Create call is a wrapper that calls the following function in Objects/moduleobject.c:

      PyObject *
      PyModule_Create2(struct PyModuleDef* module, int module_api_version)
      {
          if (!_PyImport_IsInitialized(PyThreadState_GET()->interp))
              Py_FatalError("Python import machinery not initialized");
          return _PyModule_CreateInitialized(module, module_api_version);
      }
      

      where the breakpoint is inside the if() clause. This means that the moduledef doesn't even matter. _PyImport_IsInitialized() is a function in Python/import.c:

      int
      _PyImport_IsInitialized(PyInterpreterState *interp)
      {
          if (interp->modules == NULL)
              return 0;
          return 1;
      }
      

      which doesn't seem like a very likely candidate to be causing my Access Violation®. Going into the PyThreadState_GET() finally made me realize: I'm actually debugging the Python/C++ API instead of my code...
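
For reference, the ImportErrors seen after renaming the file to my_pylib_d.pyd are consistent with how CPython on Windows maps file names to extension modules: as far as I know, a release interpreter only accepts names ending in its release suffixes, while a debug interpreter (python_d.exe) only accepts the `_d` variants. Below is a minimal sketch of that matching in Python. The two suffix lists are assumed typical values for CPython 3.6 on 64-bit Windows, and `module_name_for` and `is_debug_interpreter` are illustrative helpers, not part of the setup described above.

```python
import importlib.machinery
import sys

# ASSUMPTION: typical suffix lists for CPython 3.6 on 64-bit Windows.
# Print importlib.machinery.EXTENSION_SUFFIXES in the real interpreter to confirm.
RELEASE_SUFFIXES = [".cp36-win_amd64.pyd", ".pyd"]
DEBUG_SUFFIXES = ["_d.cp36-win_amd64.pyd", "_d.pyd"]


def module_name_for(filename, suffixes=None):
    """Return the module name a file would be importable as, or None."""
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    # Try longer suffixes first so the most specific match is used.
    for suffix in sorted(suffixes, key=len, reverse=True):
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[: -len(suffix)]
    return None


def is_debug_interpreter():
    """sys.gettotalrefcount is only defined in debug builds of CPython."""
    return hasattr(sys, "gettotalrefcount")


if __name__ == "__main__":
    print("debug interpreter:", is_debug_interpreter())
    print("accepted suffixes:", importlib.machinery.EXTENSION_SUFFIXES)
    for name in ("my_pylib.pyd", "my_pylib_d.pyd"):
        print(name,
              "release ->", module_name_for(name, RELEASE_SUFFIXES),
              "| debug ->", module_name_for(name, DEBUG_SUFFIXES))
```

Under these assumptions, a release interpreter would treat my_pylib_d.pyd as a module named my_pylib_d, and a debug interpreter would not pick up my_pylib.pyd at all. The same would apply to the release binaries shipped with numpy and scipy, which may be why every import tripped once python36_d.dll was forced.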

Questions:

  • What the heck!! Why is debugging in MATLAB using Python3.6 so damn difficult?! What is the "proper" way to do it? I can't find much in the (online) documentation for it...
  • Is there a known issue with the Python 3.7 C/C++ API that could be causing this?
  • Am I doing something wrong / stupidly? Any tips/pointers on how I can find the problem in my C++ code more effectively?
Rody Oldenhuis
  • Did you find out where the code stops when the access violation occurs? – Alan Birtles Sep 05 '18 at 06:48
  • @AlanBirtles well, sorta...in Python 3.7 at least, it stops on that first (basic) object construction (`PyModule_Create`). With Python 3.6 (target version) I can't seem to get a debugger to work in this context ... – Rody Oldenhuis Sep 05 '18 at 07:34
  • Thanks for the rant... doesn’t make the question easier to read though. :) – Cris Luengo Sep 05 '18 at 13:43
  • If the problem is in C++ code, why not call it from a stand-alone executable? If you need to debug the Python interface also, make a Python script that crashes, and debug from Python. There is no need to dig through so many layers! Also, instrumentation is your friend. If the out-of-bounds write triggers this, MATLAB might crash but a stand-alone executable might not. Instrumentation (e.g. sanitizer) might help find it quicker. – Cris Luengo Sep 05 '18 at 13:48
  • @CrisLuengo Thanks. It's true that these layers are in the way, and I'll definitely try to come up with a way to reproduce it outside of MATLAB. It's just that, I'd like to learn how to debug in this context because it's not going away anytime soon. And so far, only the MATLAB code has ever produced this particular problem - none of our tests or programs have ever found this problem. We compile with `/W4` and use basic static analysis (`CppCheck`), and all of these mechanisms are silent. I'll definitely give Sanitizer a try, though, sounds promising. – Rody Oldenhuis Sep 05 '18 at 20:42
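
One way to act on the suggestion of reproducing the crash from plain Python is to run the failing call sequence in a child interpreter with `faulthandler` enabled, so that a hard crash takes down only the child and leaves a Python-level traceback on stderr. The sketch below assumes the crash can be triggered from a short snippet. `run_isolated` is an illustrative helper and `my_pylib.some_function()` is a hypothetical placeholder for whatever the MATLAB code actually calls.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (exit_code, stderr_text). A crash in the child does not
    take down the calling process.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr.decode(errors="replace")


if __name__ == "__main__":
    # HYPOTHETICAL snippet: replace with the call sequence MATLAB performs.
    snippet = "import my_pylib; my_pylib.some_function()"
    exit_code, stderr_text = run_isolated(snippet)
    print("exit code:", exit_code)
    print(stderr_text)
```

If the snippet reproduces the crash, the same command line can be started under the MSVC debugger directly, which avoids restarting MATLAB and re-attaching for every attempt.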