
I have a large program written in C++ that I wish to make usable via Python. I've written a python extension to expose an interface through which python code can call the C++ functions. The issue I'm having with this is that installing seems to be nontrivial.

All the documentation I can find indicates that I should create a setup.py which creates a distutils.core.Extension. In every example I've found, the Extension object is given a list of source files, which it compiles. If my code were one or two files, this would be fine. Unfortunately, it's dozens of files, and I use a number of relatively complicated Visual Studio build settings. As a result, building by listing .c files seems challenging, to say the least.

I've currently configured my Python extension to build as a .dll and link against python39.lib. I tried changing the file extension to .pyd and including the file in a MANIFEST.in. After I created a setup.py and ran it, it created a .egg file that I verified did include the .pyd I created. However, after installing it, when I imported the module into Python, the module was completely empty (and I verified that the PyInit_[module] function was not called). The question "Python dll Extension Import" says that I can import the DLL if I change the extension to .pyd and place the file in the DLLs directory of Python's installation. I've encountered two problems with this.

The first is that it seems to me that it's not very distributable like this. I'd like to package this into a python wheel, and I'm not sure how a wheel could do this. The second is even more problematic - it doesn't exactly work. It calls the initialization function of my extension, and I've verified in WinDbg that it's returning a python module. However, this is what I always get from the console.

>>> import bluespawn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: initialization of bluespawn did not return an extension module

The Python documentation has a section on publishing binary extensions, but for the past four years it has been left as a placeholder. The linked GitHub issue isn't much help either; it boils down to either using distutils to build or using enscons to build. But since my build is a fairly complicated procedure, completely rewriting it to use enscons is less than desirable, to say the least.

It seems to me like placing the file in the DLLs directory is the wrong way of going about this. Given that I have a DLL and making setuptools compile everything itself seems infeasible, how should I go about installing my extension?

For reference, here's my initialization function, in case that's incorrect.

PyModuleDef bsModule{ PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, methods };
 
PyMODINIT_FUNC PyInit_bluespawn() {
    PyObject* m; 
    Py_Initialize();
    PyEval_InitThreads();
    PyGILState_STATE state = PyGILState_Ensure(); // Crashes without this. Call to PyEval_InitThreads() required for this. 
    m = PyModule_Create(&bsModule);
    PyGILState_Release(state);
    Py_Finalize();
    return m;
}

The python interface is available here: https://github.com/ION28/BLUESPAWN/blob/client-add-pylib/BLUESPAWN-win-client/src/user/python/PythonInterface.cpp

EDIT: I have a working solution that I am sure is not best practice. I created a very small C file that simply passes all calls it receives onto the large DLL I've already created. The C file is responsible for initializing the module, but everything else is handled inside the DLL. It works, but it seems like a very bad way of doing things. What I'm looking for is a better way of doing this.

James McDowell
  • So the *.pyd* works when it gets imported? Then it would be a good idea to build it externally and only reference it from *setup.py* (to include it in the *.whl*, which should install it in the *site-packages* *dir*). But the file you shared is odd: it is not a core module, then there are those empty function definitions, and then all the `extern "C" __declspec(dllexport)`, and also the *PyInit* function is not there. I'm not sure what you are aiming to export from it. – CristiFati Mar 30 '21 at 21:29
  • Maybe https://stackoverflow.com/questions/49493537/how-to-implement-fips-mode-and-fips-mode-set-in-python-3-6s-ssl-module, or https://stackoverflow.com/questions/61692747/how-to-extend-python-and-make-a-c-package/61880469#61880469 might help. I could assist you in going forward, but this week I'm kind of caught. – CristiFati Mar 30 '21 at 22:09
  • Might I suggest you look at `pybind11` for your bindings? It's a great library and it's probably easier than writing the binding code yourself. – unddoch Mar 30 '21 at 22:44
  • There's a file in the same folder as PythonInterface.cpp called bs_shims.c which I'm currently using. It gets dropped in the build folder with the resulting dll and built by setup.py. It is responsible for calling PyInit, and the module definition in it passes references to the exported functions in the Dll. – James McDowell Mar 31 '21 at 00:38

1 Answer

Let me try and divide your post into two separate questions:

  1. How to package a C++ library with a non-trivial compilation process using setuptools?
  2. Is it possible to distribute a Python package with a precompiled library?

1. How to package a C++ library with a non-trivial compilation process using setuptools

It is possible. I was quite surprised to see that setuptools offers many ways to override the compilation process (see the setuptools documentation). For example, you can use the keyword argument extra_compile_args to pass extra arguments to the compiler. In addition, since setup.py is a Python file, you can fairly easily write some code that automatically collects all the files needed for compilation. I did this myself in a project (pyinjector, on GitHub), and it worked quite well for me.

Here's some code from the setup.py:

# PROJECT_ROOT, LIBINJECTOR_DIR, LIBINJECTOR_SRC and LIBINJECTOR_WRAPPER are
# pathlib.Path constants defined elsewhere in that setup.py.
from setuptools import Extension

libinjector = Extension('pyinjector.libinjector',
                        sources=[str(c.relative_to(PROJECT_ROOT))
                                 for c in [LIBINJECTOR_WRAPPER, *LIBINJECTOR_SRC.iterdir()]
                                 if c.suffix == '.c'],
                        include_dirs=[str(LIBINJECTOR_DIR.relative_to(PROJECT_ROOT) / 'include')],
                        export_symbols=['injector_attach', 'injector_inject', 'injector_detach'],
                        define_macros=[('EM_AARCH64', '183')])
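
For a project with dozens of files spread over nested directories, the same idea can be generalized. The following is only a minimal sketch, not code from the project above: the helper name `collect_sources` and the set of suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: these suffixes cover the C and C++ sources of the project.
SOURCE_SUFFIXES = {".c", ".cc", ".cpp", ".cxx"}


def collect_sources(project_root, source_dir):
    """Return all source files under source_dir, as paths relative to project_root.

    setuptools expects source paths relative to the directory containing
    setup.py, written with forward slashes, hence as_posix().
    """
    project_root = Path(project_root)
    return sorted(
        path.relative_to(project_root).as_posix()
        for path in Path(source_dir).rglob("*")
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES
    )
```

The resulting list can be passed as the sources argument of Extension, alongside extra_compile_args for whatever compiler flags the Visual Studio project relies on.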

2. Is it possible to distribute a Python package with a precompiled library?

I understand from your edit that you've managed to get it to work, but I'll say a few words anyway. Releasing precompiled binaries with your source distribution is possible, and it is possible to release your manually-compiled binaries in a wheel file as well, but it is not recommended.

The main reason is compatibility with the target architecture. First, you'll have to include two DLLs in your distribution, one for x64 and one for x86. Second, you might lose some nice optimizations, because you'll have to instruct the compiler to ignore optimizations that are only available on specific CPU types (note that this applies to normal wheel distributions as well). If you're compiling against the Windows SDK, you'll probably want to use the user's version too. In addition, including two DLLs in your release might grow it to an awkward size for a source distribution.
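
If you decide to ship a prebuilt binary anyway, a setup.py along the following lines is one way to do it. This is a rough sketch under several assumptions: the binary has already been copied into a package directory (here hypothetically named `bluespawn`, next to an `__init__.py`), the helper names and the version string are placeholders, and overriding `has_ext_modules` is a commonly used trick to get a platform-specific wheel rather than an officially documented setuptools feature.

```python
import struct
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path

PACKAGE_NAME = "bluespawn"           # assumed package directory name
BINARY_SUFFIXES = {".pyd", ".dll"}   # assumed: Windows-only binaries


def interpreter_bits():
    """Return 32 or 64, depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


def tagged_extension_name(module):
    """Name a prebuilt extension so that only a matching interpreter imports it.

    On CPython the first entry is usually the most specific suffix,
    for example '.cp39-win_amd64.pyd' on 64-bit Windows with Python 3.9.
    """
    return module + EXTENSION_SUFFIXES[0]


def prebuilt_setup_kwargs(package_dir):
    """Build setup() arguments that ship already-compiled binaries as package data."""
    package_dir = Path(package_dir)
    binaries = sorted(
        p.name for p in package_dir.iterdir()
        if p.suffix.lower() in BINARY_SUFFIXES
    )
    return {
        "name": package_dir.name,
        "version": "0.1.0",  # placeholder version
        "packages": [package_dir.name],
        "package_data": {package_dir.name: binaries},
        "zip_safe": False,
    }


def main():
    # Imported here so the helpers above can be used without setuptools installed.
    from setuptools import Distribution, setup

    class BinaryDistribution(Distribution):
        """Claims to contain extension modules so the wheel gets a platform tag."""

        def has_ext_modules(self):
            return True

    package_dir = Path(__file__).parent / PACKAGE_NAME
    setup(distclass=BinaryDistribution, **prebuilt_setup_kwargs(package_dir))
```

In a real setup.py you would call `main()` at the bottom of the file and build one wheel per architecture, each with a Python of the matching bitness. Naming the binary with `tagged_extension_name` lets the x86 and x64 builds be told apart by the import system.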

kmaork
  • I'm compiling against the Windows XP SDK for a lot of backwards compatibility. I want this to be easily installable, which means I don't want people to have to have the whole Windows SDK plus all the libraries I link against. The build process (including vcpkg dependencies) takes about 45 minutes, which is definitely less than ideal. I could just copy the exact command line used to build and link, but I also use a lot of custom build steps and require building (and using) some custom dependencies. In short, building the full project on the host is my last option. – James McDowell Mar 26 '21 at 16:03
  • My sympathies :) And I understand from your edit that you got it to work, no? What is the problem you are facing right now? – kmaork Mar 26 '21 at 16:06
  • I got it to work, but my solution is definitely not ideal. It involves dropping the DLL on the system, then building a single source file using distutils that links against the DLL at runtime. As I understand it, both distutils and setuptools result in a .egg-info and a .pyd file. I see no reason why I shouldn't be able to make this .egg-info and .pyd on my own so that I can install without requiring building on the host. – James McDowell Mar 26 '21 at 16:09
  • I'll note that I diagnosed the issue with placing the DLL (renamed to .pyd) to being an issue with my library using a separate python runtime. So the module it returned was initialized in a separate python runtime and invalid in the python interpreter's runtime. – James McDowell Mar 26 '21 at 16:10
  • So now you are able to manually import the `.pyd`, and just want to understand how to bundle it into your distribution not too awkwardly? – kmaork Mar 26 '21 at 16:13
  • The DLL I create (renamed to pyd) is linked against python39.lib. When python loads this library and calls the `PyInit_{module}` function, the library isn't properly initialized with the existing python runtime. So when it calls `PyModule_Create`, it creates the module in a separate python runtime. The resulting module is incompatible with the existing python runtime. And so it can't be used by python properly. The binaries created by setuptools _do_ use the correct python runtime. How do I make mine do the same? Do I need my own .egg-info? – James McDowell Mar 26 '21 at 16:17
  • If I'm following you, it sounds like you are linking statically against python39.lib. Try to make sure that instead, you are linking dynamically against Python.h. I think you shouldn't need python sources or python39.lib at all – kmaork Mar 26 '21 at 16:21
  • I'll give it a shot later today. – James McDowell Mar 26 '21 at 16:33
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/230456/discussion-between-kmaork-and-james-mcdowell). – kmaork Mar 27 '21 at 23:04