I'm developing a Python module that is written in C++ and callable from Python through pybind11 bindings. I'm trying to cross-compile for M1 Macs from an Intel Mac using Apple's Xcode toolchain.

Interestingly, for some users the package works, whereas other users experience the error below:

ImportError: dlopen(/Users/frieda/miniconda3/envs/behavior/lib/python3.8/site-packages/banditpam.cpython-38-darwin.so, 0x0002): symbol not found in flat namespace '_omp_get_max_threads'

How can I statically link OpenMP so that it ships with the python wheel?

What I've tried:

Based on a similar SO question about Boost, I suspect a libstdc++ vs. libc++ mismatch, which I think could be fixed by statically linking the OpenMP library into our package. How can this be done in our setup.py, for both clang and gcc? I've tried various linker flags such as -static-libgomp, but they don't seem to work.

Do I also need to include the -fPIC flag?

Caveat: this comment recommends against linking OpenMP statically. Is there a better way to expose OpenMP functions via python bindings to our users?

EDIT: The setup.py is here; I'm building via cibuildwheel on the GitHub runners to cross-compile from a GitHub Intel Mac to an M1 Mac (GitHub M1 Mac runners are not easily accessible); see here.

Users on M1 Macs are able to install the built wheel, but some of them get the runtime error above when importing the library. The offending line is here.

Cris Luengo
  • Can you post your setup.py and other build system details? – unddoch Feb 10 '22 at 17:54
  • Thanks @unddoch! I have edited my original post with the `setup.py` and more details on the build system. Let me know if I can provide any other details. – mlquestions Feb 10 '22 at 20:54

1 Answer

The delocate package will move dynamic libraries into your package and update the binaries in your package to use those local copies. It is mostly trivial to use: it examines the binaries to discover their dependencies, so you don’t need to do any configuration.

You use it like this after creating the wheel, before distribution or uploading it to PyPI:

delocate-wheel -w fixed_wheels -v scipy-0.14.0-cp34-cp34m-macosx_10_6_intel.whl

The fixed_wheels directory will contain the updated wheel that you distribute.
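
If the build produces several wheels, the same command can be scripted. The following is only a minimal sketch: it assumes delocate is installed and on the PATH, and the `dist` input directory and both function names are hypothetical, not part of delocate.

```python
import subprocess
from pathlib import Path


def delocate_command(wheel, out_dir="fixed_wheels"):
    """Build the delocate-wheel invocation shown above for a single wheel."""
    return ["delocate-wheel", "-w", str(out_dir), "-v", str(wheel)]


def fix_wheels(wheel_dir="dist", out_dir="fixed_wheels"):
    """Run delocate-wheel on every wheel in wheel_dir (a hypothetical location).

    Returns the wheels found in out_dir afterwards.
    """
    for wheel in sorted(Path(wheel_dir).glob("*.whl")):
        # check=True raises CalledProcessError if delocate-wheel fails.
        subprocess.run(delocate_command(wheel, out_dir), check=True)
    return sorted(Path(out_dir).glob("*.whl"))
```

Calling `fix_wheels()` after the build step and uploading only what it returns keeps the unfixed wheels from being distributed by accident.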


We use this in the DIPlib project to deploy wheels containing (“vendoring”) the OpenMP library. See here for our build script. We customized delocate a bit so that it wouldn’t copy the Java VM library into the wheel, but this should not be a common use case.


But do note: I came here looking for a solution to the problem of importing multiple packages into Python that each use the OpenMP library but vendor different, incompatible versions. One DIPlib user ran into this problem, but I haven’t been able to reproduce it on my own, very similar macOS machine. I don’t know yet how common it is for this problem to show up.

Edit: Apparently this problem is not at all common:

The only unrecoverable incompatibility we encountered happens when loading a mix of compiled extensions linked with libomp (LLVM/Clang) and libiomp (ICC), on Linux and macOS, manifested by crashes or deadlocks. It can happen even with the simplest OpenMP calls like getting the maximum number of threads that will be used in a subsequent parallel region. A possible explanation is that libomp is actually a fork of libiomp causing name colliding for instance. Using threadpoolctl may crash your program in such a setting.

Fortunately this problem is very rare: at the time of writing, all major binary distributions of Python packages for Linux use either GCC or ICC to build the Python scientific packages. Therefore this problem would only happen if some packagers decide to start shipping Python packages built with LLVM/Clang instead of GCC (this is the case for instance with conda's default channel).
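
To check whether a process has ended up with the libomp/libiomp mix described in the quote, one could inspect the output of threadpoolctl's `threadpool_info()`. This is a rough sketch, assuming each entry is a dict with `user_api` and `prefix` keys; the helper name `conflicting_openmp_runtimes` is hypothetical. As the quote warns, calling threadpoolctl may itself crash in exactly this situation, so treat it as a diagnostic to try, not a guaranteed check.

```python
def conflicting_openmp_runtimes(infos):
    """Return the loaded OpenMP runtime prefixes if libomp and libiomp are both present."""
    prefixes = {
        info["prefix"]
        for info in infos
        if info.get("user_api") == "openmp" and info.get("prefix")
    }
    # Only the libomp + libiomp combination is reported as a conflict here.
    if {"libomp", "libiomp"} <= prefixes:
        return sorted(prefixes)
    return []


if __name__ == "__main__":
    try:
        from threadpoolctl import threadpool_info
    except ImportError:
        print("threadpoolctl is not installed")
    else:
        # Import the packages under suspicion before this point,
        # so that their OpenMP runtimes are already loaded.
        clash = conflicting_openmp_runtimes(threadpool_info())
        print("conflicting runtimes:", clash or "none")
```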

Cris Luengo