52

I'm writing a Python extension that links a C++ library, and I'm using CMake to help with the build process. Right now, the only way I know to bundle it is to compile everything with CMake first and only then run setup.py bdist_wheel. There must be a better way.

I was wondering if it's possible (or anybody has tried) to invoke CMake as part of the setup.py ext_modules build process? I'm guessing there is a way to create a subclass of something but I'm not sure where to look.

I'm using CMake because it gives me much more control over building C and C++ library extensions with complex build steps, exactly as I want them. Plus, I can easily build Python extensions directly with CMake using the PYTHON_ADD_MODULE() command from FindPythonLibs.cmake. I just wish this was all one step.

loneraver
  • 2
    any luck solving this? I'm facing a very similar challenge. For the time being, I added a custom target that depends on the target that builds the binaries and hacks together a setup.py that includes them as `package_data`, but it all looks like a big hack. It feels like there must be a better way – jorgeh Apr 12 '17 at 16:21
  • no luck myself. That's exactly how I've been doing it and it feels very hacky. I wish I knew a better way. – loneraver Apr 13 '17 at 00:02

3 Answers

63

What you basically need to do is override the build_ext command class in your setup.py and register it in the command classes (the cmdclass argument of setup()). In your custom implementation of build_ext, call cmake to configure and then build the extension modules. Unfortunately, the official docs are rather laconic about how to implement custom distutils commands (see Extending Distutils); I find it much more helpful to study the commands' code directly. For example, here is the source code for the build_ext command.

Example project

I have prepared a simple project consisting of a single C extension foo and a Python module spam.eggs:

so-42585210/
├── spam
│   ├── __init__.py  # empty
│   ├── eggs.py
│   ├── foo.c
│   └── foo.h
├── CMakeLists.txt
└── setup.py

Files for testing the setup

These are just some simple stubs I wrote to test the setup script.

spam/eggs.py (only for testing the library calls):

from ctypes import cdll
import pathlib


def wrap_bar():
    foo = cdll.LoadLibrary(str(pathlib.Path(__file__).with_name('libfoo.dylib')))
    return foo.bar()
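
The stub above hard-codes the macOS file name libfoo.dylib. As a minimal sketch of a more portable lookup, assuming CMake's default naming for a SHARED target (libfoo.so on Linux, libfoo.dylib on macOS, foo.dll with MSVC on Windows), something like the following could pick the name. The helper shared_lib_name is hypothetical and not part of the original answer:

```python
import sys


def shared_lib_name(stem, platform=sys.platform):
    """Return the file name CMake gives a SHARED library target by default.

    Assumption: default prefixes and suffixes are in effect. MinGW builds
    on Windows use a 'lib' prefix, so adjust if that applies to you.
    """
    if platform == "darwin":
        return f"lib{stem}.dylib"
    if platform.startswith("win"):
        return f"{stem}.dll"
    # Linux and most other Unix-like systems
    return f"lib{stem}.so"
```

In eggs.py this would replace the literal, e.g. pathlib.Path(__file__).with_name(shared_lib_name('foo')).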

spam/foo.c:

#include "foo.h"

int bar() {
    return 42;
}

spam/foo.h:

#ifndef __FOO_H__
#define __FOO_H__

int bar();

#endif

CMakeLists.txt:

cmake_minimum_required(VERSION 3.10.1)
project(spam)
set(src "spam")
set(foo_src "spam/foo.c")
add_library(foo SHARED ${foo_src})

Setup script

This is where the magic happens. Of course, there is a lot of room for improvement: you could pass additional options to the CMakeExtension class if you need to (for more info on extensions, see Building C and C++ Extensions), make the CMake options configurable via setup.cfg by overriding the initialize_options and finalize_options methods, etc.

import os
import pathlib

from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext as build_ext_orig


class CMakeExtension(Extension):

    def __init__(self, name):
        # don't invoke the original build_ext for this special extension
        super().__init__(name, sources=[])


class build_ext(build_ext_orig):

    def run(self):
        for ext in self.extensions:
            self.build_cmake(ext)
        super().run()

    def build_cmake(self, ext):
        cwd = pathlib.Path().absolute()

        # these dirs will be created in build_py, so if you don't have
        # any python sources to bundle, the dirs will be missing
        build_temp = pathlib.Path(self.build_temp)
        build_temp.mkdir(parents=True, exist_ok=True)
        extdir = pathlib.Path(self.get_ext_fullpath(ext.name))
        extdir.mkdir(parents=True, exist_ok=True)

        # example of cmake args
        config = 'Debug' if self.debug else 'Release'
        cmake_args = [
            '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=' + str(extdir.parent.absolute()),
            '-DCMAKE_BUILD_TYPE=' + config
        ]

        # example of build args
        build_args = [
            '--config', config,
            '--', '-j4'
        ]

        # Troubleshooting: if the build step below fails, delete any stale
        # CMake files, including "CMakeCache.txt" in the top-level dir.
        os.chdir(str(build_temp))
        try:
            self.spawn(['cmake', str(cwd)] + cmake_args)
            if not self.dry_run:
                self.spawn(['cmake', '--build', '.'] + build_args)
        finally:
            # restore the working directory even if cmake fails
            os.chdir(str(cwd))


setup(
    name='spam',
    version='0.1',
    packages=['spam'],
    ext_modules=[CMakeExtension('spam/foo')],
    cmdclass={
        'build_ext': build_ext,
    }
)
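
The comments below mention two portability issues with the example arguments: multi-config generators such as Visual Studio put the output in an extra "Release" folder, and "-- -j4" only works for make-like generators. A possible sketch of an argument builder that addresses both is shown here. The function name cmake_commands and its parameters are hypothetical; the per-config output variables and the --parallel option (CMake 3.12+) are standard CMake features:

```python
def cmake_commands(source_dir, output_dir, debug=False, jobs=None):
    """Build the configure and build command lines for one CMake project."""
    config = "Debug" if debug else "Release"
    configure = [
        "cmake", str(source_dir),
        f"-DCMAKE_BUILD_TYPE={config}",
        f"-DCMAKE_LIBRARY_OUTPUT_DIRECTORY={output_dir}",
        # Multi-config generators (e.g. Visual Studio) append a per-config
        # subfolder such as "Release" unless the per-config variable is set.
        f"-DCMAKE_LIBRARY_OUTPUT_DIRECTORY_{config.upper()}={output_dir}",
        # On Windows, DLLs count as runtime artifacts, not library ones.
        f"-DCMAKE_RUNTIME_OUTPUT_DIRECTORY_{config.upper()}={output_dir}",
    ]
    build = ["cmake", "--build", ".", "--config", config]
    if jobs:
        # --parallel needs CMake 3.12+ and works with any generator.
        build += ["--parallel", str(jobs)]
    return configure, build
```

Inside build_cmake, the two lists could then be passed to self.spawn() in place of the hand-written cmake_args and build_args.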

Testing

Build the project's wheel and install it, then check that the library was installed:

$ pip show -f spam
Name: spam
Version: 0.1
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /Users/hoefling/.virtualenvs/stackoverflow/lib/python3.6/site-packages
Requires: 
Files:
  spam-0.1.dist-info/DESCRIPTION.rst
  spam-0.1.dist-info/INSTALLER
  spam-0.1.dist-info/METADATA
  spam-0.1.dist-info/RECORD
  spam-0.1.dist-info/WHEEL
  spam-0.1.dist-info/metadata.json
  spam-0.1.dist-info/top_level.txt
  spam/__init__.py
  spam/__pycache__/__init__.cpython-36.pyc
  spam/__pycache__/eggs.cpython-36.pyc
  spam/eggs.py
  spam/libfoo.dylib

Run the wrapper function from spam.eggs module:

$ python -c "from spam import eggs; print(eggs.wrap_bar())"
42
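
Since a wheel is an ordinary zip archive, the same check can be done before installing. Here is a minimal sketch, assuming the wheel was written to dist/ by bdist_wheel; the helper name wheel_members is hypothetical:

```python
import zipfile


def wheel_members(wheel_path, suffixes=(".so", ".dylib", ".dll", ".pyd")):
    """List the files inside a wheel whose names end with one of `suffixes`."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # str.endswith accepts a tuple, so `suffixes` must be a tuple
        return [name for name in wheel.namelist() if name.endswith(suffixes)]
```

An empty result for the freshly built wheel would suggest that the CMake output never reached the build directory that setuptools packages.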
Contango
hoefling
  • 1
    Any reason for using distutils.command's build_ext instead of ```from setuptools.command.build_ext import build_ext```? –  Jun 14 '18 at 03:25
  • Sorry one last thing; is overriding the ```run()``` method really necessary and why, as well as using a new ```build_cmake()``` method instead of just placing it into ```build_ext.build_extension()```. The reason I ask is I am trying to make my setup.py reflect as close to what happens in a "normal" package and thought going with overriding "build_extension" being what made more sense, but now I am running into a lot of weird exceptions. Just wondering what the motivation was there. –  Jun 26 '18 at 23:24
  • Okay, nevermind then! I see exactly why you were overriding the run module. And why the extension directory must be present! Why on earth that "tricks" super().run() to not execute and attempt to build the extension by itself is a mystery to me. I thought I could cheat the process and just jump straight to the parent directory of `extdir` but apparently not. This topic could use volumes more documentation... –  Jun 28 '18 at 01:07
  • 1
    Indeed, I remember trying to override `build_extension` and running into errors I couldn't resolve. Let me review this once I'm back at work, I'll update the answer and ping you if you'll still be interested. – hoefling Jun 28 '18 at 10:20
  • 1
    I am definitely still interested, and thank you for your time. –  Jul 07 '18 at 16:48
  • 2
    For me the 'extdir.mkdir' part makes a directory with the name of the shared library of the extension, which then makes the build fail when it tries to create the shared library with that name. I changed it to extdir.parent.mkdir(...) to get around this. – Ben Farmer Oct 31 '18 at 12:46
  • @BenFarmer This is exactly as I did. It is worth noting that (I believe) if your extension is a single .pyd file, naming the extension after the extension directory should work okay. –  Nov 08 '18 at 14:08
  • @hoefling, Perhaps do you know if there's a way to use the cmake approach you've demonstrated in your example, and use wrapper C file that include from `python.h` instead of ctypes (`cdll.LoadLibrary`) ? thanks – Zohar81 Dec 03 '18 at 08:31
  • @Zohar81 it shouldn't make any difference whether you have a C wrapper with a Python interface. It only affects the includes - make sure you have included matching Python headers. Should you have any troubles, I'd suggest asking another question with a failing code snippet. – hoefling Dec 04 '18 at 23:32
  • 2
    One more thing; you have a line in there: `config = 'Debug' if self.debug else 'Release'`. How do set `self.debug` to true? Is there some command line flag that does it during installation? I couldn't seem to find one. – Ben Farmer Dec 05 '18 at 10:15
  • 1
    @BenFarmer it's a flag for the stdlib's `build_ext` command. E.g. `python setup.py build_ext --debug` or `debug = 1` in `[build_ext]` section of the setup config. – hoefling Dec 05 '18 at 10:46
  • On error `Cmake Error: could not load cache`, see https://stackoverflow.com/questions/16319292/cmake-error-could-not-load-cache, solution is to delete `CMakeCache.txt` in the root and try again. – Contango Jun 01 '20 at 16:22
  • Hi, on windows this compiles fine but it creates the "Release" folder. I assume it is a VS thing, but it makes the installation be corrupt – Santi Peñate-Vera Jul 03 '20 at 13:34
  • how do I upload this to PyPI with the CMakeLists.txt included in the upload? – user352102 Feb 04 '21 at 02:50
  • 1
    @user352102 you record `CMakeLists.txt` in `MANIFEST.in`, see [Python docs to sdists](https://python.readthedocs.io/en/stable/distutils/sourcedist.html#specifying-the-files-to-distribute). It's better to build and upload wheels though, as they already contain prebuilt extensions and do not require CMake or dependent libs to be installed on target system. – hoefling Feb 04 '21 at 08:22
  • @hoefling let's say for some reason the wheel is not working, how can I assure that the .so (or .dylib) file created by the CMakeLists.txt is placed in site-packages permanently and is locatable? – user352102 Feb 04 '21 at 22:09
  • @user352102 by listing them in the `ext_modules` list. If you want to include and install arbitrary libraries, this is not what the sdist format is for; it is for installing Python modules and extension modules exclusively. – hoefling Feb 04 '21 at 22:18
  • 1
    Consider using [scikit-build](https://scikit-build.readthedocs.io/en/latest/usage.html) instead of a hand-rolled solution. It uses CMake natively. – Alex Reinking Jul 27 '21 at 16:07
18

I would like to add my own answer to this, as a sort of addendum to what hoefling described.

Thanks, hoefling, as your answer helped get me on track towards writing a setup script in much the same manner for my own repository.

Preamble

The primary motivation for writing this answer is trying to "glue together" the missing pieces. The OP does not state the nature of the C/C++ Python module being developed; I'd like to make it clear up front that the steps below are for a C/C++ CMake build chain that creates multiple .dll/.so files as well as a precompiled .pyd/.so file, in addition to some generic .py files that need to be placed in the scripts directory.

All of these files come to fruition directly after the cmake build command is run... fun. There is no recommendation for building a setup.py this way.

setup.py assumes that your scripts are part of your package or library, and that any .dll files to be built are declared through the libraries option, with sources and include dirs listed. There is therefore no intuitive way to tell setuptools that the libraries, scripts and data files produced by a single cmake --build call in build_ext should each go to their own respective places. It gets worse if you want the module to be tracked by setuptools and fully uninstallable, so that users can remove every trace of it from their system if they wish.

The module that I was writing a setup.py for is bpy, the .pyd/ .so equivalent of building blender as a python module as described here:

https://wiki.blender.org/wiki//User:Ideasman42/BlenderAsPyModule (better instructions, but now a dead link)

http://www.gizmoplex.com/wordpress/compile-blender-as-python-module/ (possibly worse instructions, but seems to be online still)

You can check out my repository on github here:

https://github.com/TylerGubala/blenderpy

That is my motivation behind writing this answer, and hopefully will help anyone else trying to accomplish something similar rather than throwing away their cmake build chain or, worse yet, having to maintain two separate build environments. I apologize if it is off topic.

So what do I do to accomplish this?

  1. Extend the setuptools.Extension class with a class of my own, which does not contain entries for the sources or libs properties

  2. Extend the setuptools.command.build_ext.build_ext class with a class of my own, which has a custom method which performs my necessary build steps (git, svn, cmake, cmake --build)

  3. Extend the distutils.command.install_data.install_data class (yuck, distutils... however there doesn't seem to be a setuptools equivalent) with a class of my own, to mark the built binary libraries during setuptools' record creation (installed-files.txt) such that

    • The libraries will be recorded and will be uninstalled with pip uninstall package_name

    • The command py setup.py bdist_wheel will work natively as well, and can be used to provide precompiled versions of your source code

  4. Extend the setuptools.command.install_lib.install_lib class with a class of my own, which will ensure that the built libraries are moved from their resultant build folder into the folder that setuptools expects them in (on Windows it will put the .dll files in a bin/Release folder and not where setuptools expects it)

  5. Extend the setuptools.command.install_scripts.install_scripts class with a class of my own such that the scripts files are copied to the correct directory (Blender expects the 2.79 or whatever directory to be in the scripts location)

  6. After the build steps are performed, copy those files into a known directory that setuptools will copy into the site-packages directory of my environment. At this point the remaining setuptools and distutils classes can take over writing the installed-files.txt record and will be fully removable!

Sample

Here is a sample, more or less from my repository, but trimmed for clarity of the more specific things (you can always head over to the repo and look at it for yourself)

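import os
import pathlib
import shutil
import struct

from setuptools import find_packages, setup, Extension
from setuptools.command.build_ext import build_ext
from setuptools.command.install_lib import install_lib
from setuptools.command.install_scripts import install_scripts
from distutils.command.install_data import install_data

BITS = struct.calcsize("P") * 8
PACKAGE_NAME = "example"
# Assumption: the CMake project root is the directory containing this
# setup.py; point SOURCE_DIR at your own source checkout if it differs.
SOURCE_DIR = os.path.dirname(os.path.abspath(__file__))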

class CMakeExtension(Extension):
    """
    An extension to run the cmake build

    This simply overrides the base extension class so that setuptools
    doesn't try to build your sources for you
    """

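    def __init__(self, name, sources=None):
        # Avoid a shared mutable default argument; an empty list means
        # there is nothing for setuptools to compile itself
        super().__init__(name=name, sources=sources or [])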

class InstallCMakeLibsData(install_data):
    """
    Just a wrapper to get the install data into the egg-info

    Listing the installed files in the egg-info guarantees that
    all of the package files will be uninstalled when the user
    uninstalls your package through pip
    """

    def run(self):
        """
        Outfiles are the libraries that were built using cmake
        """

        # There seems to be no other way to do this; I tried listing the
        # libraries during the execution of the InstallCMakeLibs.run() but
        # setuptools never tracked them, seems like setuptools wants to
        # track the libraries through package data more than anything...
        # help would be appreciated

        self.outfiles = self.distribution.data_files

class InstallCMakeLibs(install_lib):
    """
    Get the libraries from the parent distribution, use those as the outfiles

    Skip building anything; everything is already built, forward libraries to
    the installation step
    """

    def run(self):
        """
        Copy libraries from the bin directory and place them as appropriate
        """

        self.announce("Moving library files", level=3)

        # We have already built the libraries in the previous build_ext step

        self.skip_build = True

        bin_dir = self.distribution.bin_dir

        # Depending on the files that are generated from your cmake
        # build chain, you may need to change the below code, such that
        # your files are moved to the appropriate location when the installation
        # is run

        libs = [os.path.join(bin_dir, _lib) for _lib in 
                os.listdir(bin_dir) if 
                os.path.isfile(os.path.join(bin_dir, _lib)) and 
                os.path.splitext(_lib)[1] in [".dll", ".so"]
                and not (_lib.startswith("python") or _lib.startswith(PACKAGE_NAME))]

        for lib in libs:

            shutil.move(lib, os.path.join(self.build_dir,
                                          os.path.basename(lib)))

        # Mark the libs for installation, adding them to 
        # distribution.data_files seems to ensure that setuptools' record 
        # writer appends them to installed-files.txt in the package's egg-info
        #
        # Also tried adding the libraries to the distribution.libraries list, 
        # but that never seemed to add them to the installed-files.txt in the 
        # egg-info, and the online recommendation seems to be adding libraries 
        # into eager_resources in the call to setup(), which I think puts them 
        # in data_files anyways. 
        # 
        # What is the best way?

        # These are the additional installation files that should be
        # included in the package, but are resultant of the cmake build
        # step; depending on the files that are generated from your cmake
        # build chain, you may need to modify the below code

        self.distribution.data_files = [os.path.join(self.install_dir, 
                                                     os.path.basename(lib))
                                        for lib in libs]

        # Must be forced to run after adding the libs to data_files

        self.distribution.run_command("install_data")

        super().run()

class InstallCMakeScripts(install_scripts):
    """
    Install the scripts in the build dir
    """

    def run(self):
        """
        Copy the required directory to the build directory and super().run()
        """

        self.announce("Moving scripts files", level=3)

        # Scripts were already built in a previous step

        self.skip_build = True

        bin_dir = self.distribution.bin_dir

        scripts_dirs = [os.path.join(bin_dir, _dir) for _dir in
                        os.listdir(bin_dir) if
                        os.path.isdir(os.path.join(bin_dir, _dir))]

        for scripts_dir in scripts_dirs:

            shutil.move(scripts_dir,
                        os.path.join(self.build_dir,
                                     os.path.basename(scripts_dir)))

        # Mark the scripts for installation, adding them to 
        # distribution.scripts seems to ensure that the setuptools' record 
        # writer appends them to installed-files.txt in the package's egg-info

        self.distribution.scripts = scripts_dirs

        super().run()

class BuildCMakeExt(build_ext):
    """
    Builds using cmake instead of the python setuptools implicit build
    """

    def run(self):
        """
        Perform build_cmake before doing the 'normal' stuff
        """

        for extension in self.extensions:

            if extension.name == 'example_extension':

                self.build_cmake(extension)

        super().run()

    def build_cmake(self, extension: Extension):
        """
        The steps required to build the extension
        """

        self.announce("Preparing the build environment", level=3)

        build_dir = pathlib.Path(self.build_temp)

        extension_path = pathlib.Path(self.get_ext_fullpath(extension.name))

        os.makedirs(build_dir, exist_ok=True)
        os.makedirs(extension_path.parent.absolute(), exist_ok=True)

        # Now that the necessary directories are created, build

        self.announce("Configuring cmake project", level=3)

        # Change your cmake arguments below as necessary
        # Below is just an example set of arguments for building Blender as a Python module

        self.spawn(['cmake', '-H'+SOURCE_DIR, '-B'+self.build_temp,
                    '-DWITH_PLAYER=OFF', '-DWITH_PYTHON_INSTALL=OFF',
                    '-DWITH_PYTHON_MODULE=ON',
                    f"-DCMAKE_GENERATOR_PLATFORM=x"
                    f"{'86' if BITS == 32 else '64'}"])

        self.announce("Building binaries", level=3)

        self.spawn(["cmake", "--build", self.build_temp, "--target", "INSTALL",
                    "--config", "Release"])

        # Build finished, now copy the files into the copy directory
        # The copy directory is the parent directory of the extension (.pyd)

        self.announce("Moving built python module", level=3)

        bin_dir = os.path.join(build_dir, 'bin', 'Release')
        self.distribution.bin_dir = bin_dir

        pyd_path = [os.path.join(bin_dir, _pyd) for _pyd in
                    os.listdir(bin_dir) if
                    os.path.isfile(os.path.join(bin_dir, _pyd)) and
                    os.path.splitext(_pyd)[0].startswith(PACKAGE_NAME) and
                    os.path.splitext(_pyd)[1] in [".pyd", ".so"]][0]

        shutil.move(pyd_path, extension_path)

        # After build_ext is run, the following commands will run:
        # 
        # install_lib
        # install_scripts
        # 
        # These commands are subclassed above to avoid pitfalls that
        # setuptools tries to impose when installing these, as it usually
        # wants to build those libs and scripts as well or move them to a
        # different place. See comments above for additional information

setup(name='my_package',
      version='1.0.0a0',
      packages=find_packages(),
      ext_modules=[CMakeExtension(name="example_extension")],
      description='An example cmake extension module',
      long_description=open("./README.md", 'r').read(),
      long_description_content_type="text/markdown",
      keywords="test, cmake, extension",
      classifiers=["Intended Audience :: Developers",
                   "License :: OSI Approved :: "
                   "GNU Lesser General Public License v3 (LGPLv3)",
                   "Natural Language :: English",
                   "Programming Language :: C",
                   "Programming Language :: C++",
                   "Programming Language :: Python",
                   "Programming Language :: Python :: 3.6",
                   "Programming Language :: Python :: Implementation :: CPython"],
      license='GPL-3.0',
      cmdclass={
          'build_ext': BuildCMakeExt,
          'install_data': InstallCMakeLibsData,
          'install_lib': InstallCMakeLibs,
          'install_scripts': InstallCMakeScripts
          }
    )
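
The file-selection logic in InstallCMakeLibs.run() can also be factored into a plain function, which makes it testable without running setuptools at all. This is a minimal sketch under the same assumptions as the sample (libraries are .dll/.so files sitting directly in the bin directory); the name find_built_libs is hypothetical:

```python
from pathlib import Path


def find_built_libs(bin_dir, package_name, suffixes=(".dll", ".so")):
    """Return the shared libraries in `bin_dir`, skipping the Python runtime
    and the extension module itself (mirrors the filter in the sample)."""
    return sorted(
        path for path in Path(bin_dir).iterdir()
        if path.is_file()
        and path.suffix in suffixes
        # str.startswith accepts a tuple of prefixes
        and not path.name.startswith(("python", package_name))
    )
```

The returned paths could then be moved with shutil.move() exactly as in the loop over libs above.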

Once the setup.py has been authored this way, building the python module is as simple as running a setup.py command such as py setup.py build, which will run the build and produce the outfiles.

It is recommended that you produce a wheel for users who are on slow connections or who do not want to build from source. To do that, install the wheel package (py -m pip install wheel), produce a wheel distribution by running py setup.py bdist_wheel, and then upload it using twine like any other package.

  • Hi, I've got a project that have wrapper C file that include from `python.h` and provide some api to be used in python that call the C/C++ api and convert their outputs to `pyObject` type. is there an option to compile this in cmake and use the output `.so` library file to generate python module inside egg file (so far I only managed to do so by compiling directly from the `extension` object) – Zohar81 Dec 03 '18 at 08:33
  • @Zohar81 If you are able to compile using the `Extension` object, have a `setup.py` file structured similarly and have installed the latest version of `wheel` (`pip install wheel`) then running `setup.py bdist_wheel` should get you the desired result. If not, or if you have any additional questions, please let me know. –  Dec 03 '18 at 19:04
  • 1
    I adapted your code for my needs and noticed you don't need `InstallCMakeLibsData`, but instead can override `InstallCMakeLibs.get_outputs`. However, your code helped me a lot to find an entry to the setuptools and distutils code. – John Aug 27 '19 at 14:25
  • @John that's an interesting observation, though I still tend to feel a little safer subclassing `install_data`; however, it is up to developer preference and necessity at that point. Duly noted. –  Aug 27 '19 at 16:23
1

I had the same problem and I solved it by writing a custom setuptools build command that copied the pre-existing built pyd. It just seemed cleaner to build with CMake and package with setuptools, rather than trying to build with setuptools.

The package is on PyPI

The code is on GitHub

Hope this is helpful.
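
The linked code is not reproduced here, so the following is only a rough sketch of the "copy the pre-built extension" idea, not the author's implementation. It assumes CMake has already produced a file named after the module plus one of the extension suffixes the running interpreter accepts; copy_prebuilt_extension is a hypothetical helper that a custom build command could call:

```python
import shutil
from importlib.machinery import EXTENSION_SUFFIXES
from pathlib import Path


def copy_prebuilt_extension(build_dir, package_dir, module_name):
    """Copy a CMake-built extension module into the package directory.

    Looks for <module_name><suffix> in `build_dir`, trying every extension
    suffix the running interpreter accepts (e.g. '.pyd' or '.so').
    """
    for suffix in EXTENSION_SUFFIXES:
        candidate = Path(build_dir) / (module_name + suffix)
        if candidate.is_file():
            target_dir = Path(package_dir)
            target_dir.mkdir(parents=True, exist_ok=True)
            # copy2 returns the path of the newly created file
            return Path(shutil.copy2(candidate, target_dir))
    raise FileNotFoundError(
        f"no built extension for {module_name!r} in {build_dir}"
    )
```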