I have a question about good practices for writing a C++ library that includes wrappers for other languages, such as Python and Matlab. This may be a simple question or a duplicate, but I haven't found a good resource or another answer which helps explain how to do this.

For background, I am working on a C++ library project that has C wrappers included specifically for compatibility with other languages. The library is a scientific computing library written in C++, and I have already written the C wrappers for the functions and classes to be used as part of a shared library.

My question is how to incorporate the C wrappers into the modules for other languages, such as wrapper libraries for Python and Matlab. I'm not asking for specifics on how to implement the code for these other languages, because that is another question entirely, and I already have a basic understanding of how to write the code that can be built and loaded for each language as its own library. My question is mainly about including the C wrappers into these other builds.

For example, I have the following directory structure, where each folder contains code relevant to a different language/module.

top
├─ matlab
│  ├─ CMakeLists.txt
│  ├─ matlab_wrapper.hpp
│  └─ matlab_wrapper.cc
├─ python
│  ├─ CMakeLists.txt
│  ├─ python_wrapper.hpp
│  └─ python_wrapper.cc
├─ src
│  ├─ CMakeLists.txt
│  ├─ c_wrapper.hpp
│  ├─ c_wrapper.cc
│  └─ other code...
└─ CMakeLists.txt

I understand that each folder should be self-sufficient, and buildable on its own, but I am having trouble figuring out how to distribute the code so that it includes the C wrappers. Basically, I have the C implementation in the c_wrapper files and I would like to use this in the matlab and python libraries.

Normally, I would include the other source files during the build step and be done with it (this may just be the answer), but everything in the src folder is built and distributed as its own shared library. It seems redundant to either rewrite the C wrappers in the matlab_wrapper and python_wrapper files (basically creating 3 versions of the same function) or to include the c_wrapper files in the build step of the other libraries because they are already included in the main C++ library.

I'm not an expert in packaging shared libraries, so what is the best way to go about this? Should I just include the c_wrapper source files as part of the build step, rewrite the C++ wrappers in each module, or is there another way to include the code in c_wrapper in the wrapper libraries? Ideally, each module will share the same (or very similar) API, so it would be great if there were a way to include this code without rewriting the same function every time a new C wrapper is added. Any help or advice is greatly appreciated.

Adam
  • Why aren’t you just including headers from, and linking to, the main library in each wrapper? – Davis Herring May 27 '18 at 05:35
  • @DavisHerring This may be that I just don't know how to do this properly, but if I do that, wouldn't I need to write a wrapper of a wrapper? – Adam May 27 '18 at 05:52
  • @Adam: Not sure what you mean by that -- you would end up with 2 shared libraries, the C interface depending on the C++ one. – Acorn May 27 '18 at 08:10
  • If this is intended to be used from many languages, typically you want to expose only a "C" interface in the main shared library. Then, you create bindings for the different languages, including C++. For this one, you don't really need bindings, but if you want to simplify usage, you may provide C++ classes that wrap the C API; and this may not even need to be a shared library. – Acorn May 27 '18 at 08:14
  • @Acorn Maybe I misunderstood. I guess what I mean is if all of the C++/C wrapper code is in `src` and becomes one library, then if the other language binding libraries import the C interface and header files, then they would still have to wrap that again, correct? Maybe this is the best way? In the end, I was thinking there would be one library for each language along with the main library. Not sure if that's the right way, though. – Adam May 27 '18 at 09:33
  • @Adam: Yes, you get two wrappings, because the outer one has a signature determined by the host language (_e.g._, `PyObject*(PyObject*,PyObject*)`) and so cannot be shared. You could instead write a wrapper _in_ Python using `ctypes`, but the count of wrappers is the same. – Davis Herring May 27 '18 at 13:58
  • @DavisHerring Thank you for the info. It seems like there will always be a wrapper for each language, meaning somewhat redundant functions in the whole library across all folders. I guess I was hoping I wouldn't have to redefine the wrappers each time if they're going to be identical. In Python, they'll be a little different due to `PyObject`, but in Matlab, for example, the C wrappers and the Matlab wrappers are basically the same. – Adam May 27 '18 at 22:08
  • @Adam: If the wrapper that Matlab wants is itself suitable for use _by_ the (say) Python wrapper, nothing stops you from using your library's C interface directly for Matlab and (re)wrapping it for Python. – Davis Herring May 27 '18 at 22:35
  • @DavisHerring Thank you. So on a related note, is there any benefit to using the C wrappers in the Matlab or Python wrapper libraries, or is it better to wrap the C++ functions individually in each module? I know they will all vary slightly, such as in the case of the Python/C extension wrapper, but would it be easiest to wrap the C++ code or to wrap the C wrapper? Also, should this be an answer? – Adam May 27 '18 at 23:06
  • @Adam: Using C wrappers in the same binary as the C++ code can avoid some ABI problems, but those can also be addressed without extra code by controlling the compilations. If you want an answer, I’ll make one, but it will largely say “compile and link like always”. – Davis Herring May 28 '18 at 02:51
  • @DavisHerring If that's the answer, then it would be good to be an answer instead of a comment, I think. It may be good to state that since each language has its own way of writing the C bindings, that it's best to rewrite the wrappers for every language being used. Then "compile and link like always". – Adam May 28 '18 at 17:29

1 Answer

You might be able to make a single shared library for all purposes: the PyInit_foo that Python looks for will simply be ignored by Matlab, after all. But you still might not want to alter your main library to support such uses: maybe it has other compiled clients that need no wrapper, or maybe it’s supposed to be installed separately from special wrappers, or maybe it needs to be usable on a machine without the hosts installed.

Another option is to make one shared library for each ultimate client. Linking the same object files into each defeats some of the purpose of shared libraries, like sharing memory between a Matlab process and a concurrent Python process each using your library. It might (I’m not sure) also end up running global constructors more than once, but those are best avoided anyway. These issues might not matter for your use cases, but the same build issues as before for the real C++ library apply.

Otherwise, you’ll have multiple shared libraries in one process (one for the “real” C++ library and another for the host language module). There are ABI issues here; one way to avoid them is by providing, as you suggested, a C API in the core library. There are, of course, other ways of dealing with those issues, especially when you control all the compilations.
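For illustration, here is a minimal sketch of what such a C API over a C++ core might look like. Everything named here (`core::Accumulator`, the `lib_accumulator_*` functions, the 0 / -1 return codes) is hypothetical and stands in for whatever your `c_wrapper` files actually expose. The idea is an opaque handle plus plain C signatures, with exceptions caught before they can cross the boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// C++ core: a hypothetical stand-in for the real library code in src/.
namespace core {
class Accumulator {
public:
    void add(double x) { values_.push_back(x); }
    double sum() const {
        return std::accumulate(values_.begin(), values_.end(), 0.0);
    }
private:
    std::vector<double> values_;
};
}  // namespace core

// C API declarations: these would normally live in c_wrapper.hpp.
extern "C" {
typedef struct lib_accumulator lib_accumulator;  // opaque to callers

lib_accumulator* lib_accumulator_create(void);
void lib_accumulator_destroy(lib_accumulator* acc);
int lib_accumulator_add(lib_accumulator* acc, double x);          // 0 on success, -1 on error
int lib_accumulator_sum(const lib_accumulator* acc, double* out); // 0 on success, -1 on error
}

// The handle's layout is visible only inside the library.
struct lib_accumulator {
    core::Accumulator impl;
};

namespace {
// Internal helper: the public functions check for null before calling this.
template <class Handle>
auto& unwrap(Handle* acc) {
    assert(acc != nullptr);
    return acc->impl;
}
}  // namespace

// C API definitions: these would normally live in c_wrapper.cc.
extern "C" lib_accumulator* lib_accumulator_create(void) {
    try { return new lib_accumulator{}; } catch (...) { return nullptr; }
}

extern "C" void lib_accumulator_destroy(lib_accumulator* acc) {
    delete acc;  // deleting a null pointer is a no-op
}

extern "C" int lib_accumulator_add(lib_accumulator* acc, double x) {
    if (!acc) return -1;
    try { unwrap(acc).add(x); return 0; } catch (...) { return -1; }
}

extern "C" int lib_accumulator_sum(const lib_accumulator* acc, double* out) {
    if (!acc || !out) return -1;
    try { *out = unwrap(acc).sum(); return 0; } catch (...) { return -1; }
}
```

Because only C types and an opaque pointer appear in the signatures, a host module built with a different compiler or standard library can call these functions without depending on the layout of any C++ type.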

There are yet more approaches: the main library could include the interface for one host language (especially if it can also serve as the C API) but not another, or one shared library could serve multiple languages without including the C++ core. Given the C API, you may be able to use an FFI from the host language (`ctypes` for Python) instead of writing (more) C specifically for the host.

Whatever link strategy you choose, compiling is always the same: just #include whatever of your headers are relevant (those for the C API if appropriate) and any needed for the host language (if any) and go.
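As a rough sketch of what one of those host-side source files ends up looking like: the host dictates the outer signature, so the wrapper only unpacks arguments and forwards to the C API. Note the assumptions here: `lib_dot` is a made-up C API function defined inline so the example stands alone (in a real build it would come from `c_wrapper.hpp` and the core shared library), and `host_array` / `host_dot` are hypothetical stand-ins for a real host type and entry point such as `PyObject*` or `mxArray*`.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the core library's C API. In a real build this would be
// `#include "c_wrapper.hpp"` plus linking against the core shared library.
extern "C" int lib_dot(const double* a, const double* b, std::size_t n, double* out) {
    if (!a || !b || !out) return -1;
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i) acc += a[i] * b[i];
    *out = acc;
    return 0;
}

// Hypothetical stand-in for a host-language value (PyObject*, mxArray*, ...).
struct host_array {
    const double* data;
    std::size_t length;
};

// Host-facing entry point. Its signature is fixed by the (hypothetical) host,
// which is why this layer cannot be shared between languages. It validates
// and unpacks the arguments, then forwards to the shared C API.
bool host_dot(const host_array* args, std::size_t nargs, double* result) {
    assert(result != nullptr);  // in this sketch the host always supplies a result slot
    if (nargs != 2 || args[0].length != args[1].length) return false;
    return lib_dot(args[0].data, args[1].data, args[0].length, result) == 0;
}
```

The numerical work stays in one place (the C API); each language's module contributes only the thin argument-translation layer.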

Davis Herring