TL;DR How do you get distutils/setuptools to include a non-pure data file correctly?

I've got a project that does some non-trivial code generation using a custom toolchain, then wraps that generated code with SWIG, and finally builds Python extensions. CMake encapsulates all of this excellently, and in the end I have a single folder in the project's root that works exactly like any other Python package.

I'd like to have a simple setup.py so I can wrap this package into a wheel and ship it off to PyPI so that normal Python users don't have to deal with the build process. There are plenty of answers on SO for how to force setuptools to generate a non-pure wheel, and then you can bundle the extensions by using the package_data field or a MANIFEST.in file.

The problem is that this method results in a malformed wheel, because the extensions get included under purelib instead of the root directory (where they belong in a Root-Is-Purelib: false wheel). Some tools and distros rely on this separation being correct.
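One way to check where the extensions ended up is to open the wheel as a zip archive. The sketch below is only an illustration: the `inspect_wheel` helper and the `EXT_SUFFIXES` tuple are hypothetical names, and it assumes the standard wheel layout in which the WHEEL metadata file lives in the `.dist-info` directory and files not at the root sit under `{name}-{version}.data/<category>/`.

```python
import zipfile

# Suffixes treated as compiled extensions (an assumption for this sketch).
EXT_SUFFIXES = (".so", ".pyd")


def inspect_wheel(wheel_path):
    """Return (root_is_purelib, placements) for a wheel archive.

    placements maps each extension file to "root" or to the name of the
    .data subdirectory it sits in (for example "purelib").
    """
    root_is_purelib = None
    placements = {}
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            parts = name.split("/")
            if parts[0].endswith(".dist-info") and parts[-1] == "WHEEL":
                # Parse the "Root-Is-Purelib: true|false" metadata line.
                for line in wheel.read(name).decode("utf-8").splitlines():
                    key, _, value = line.partition(":")
                    if key.strip().lower() == "root-is-purelib":
                        root_is_purelib = value.strip().lower() == "true"
            elif name.endswith(EXT_SUFFIXES):
                if parts[0].endswith(".data") and len(parts) > 2:
                    placements[name] = parts[1]
                else:
                    placements[name] = "root"
    return root_is_purelib, placements
```

A wheel showing the problem described above would report `root_is_purelib` as `False` while at least one extension is placed in `"purelib"` rather than `"root"`.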

Answers I'm not interested in: custom extensions that run CMake from within setup.py (I don't want to add another layer of indirection for configuring the project, and I don't want to maintain it when build options change), or modifying the generated wheel. I'd also prefer to avoid adding any files to the project root other than setup.py.

nickelpro
  • Python extensions should be built within the setup script, with the sources and config provided via `ext_modules`. – hoefling Aug 15 '20 at 11:19
  • @hoefling Unfortunately that would require maintaining a custom setup.py build extension to run the scripts that generate the code, duplicating the job cmake and Ninja already do much better. I explicitly mentioned that as a non-starter in the question. – nickelpro Aug 16 '20 at 06:46
  • So IIUC you want to tell the setup script that you are building a wheel with extensions without providing one and packaging a prebuilt shared object as package data? This will also require maintaining custom `setup.py` hacks, which is IMO a lot worse than extending `build_ext`. If CMake is more suitable to you, I'd suggest extending the build instructions with a custom target that will assemble the wheel, writing wheel metadata and zipping the contents. – hoefling Aug 16 '20 at 12:16
  • @hoefling Sure but the hack for that is trivial, the subclass of the Distribution is three lines of python (https://stackoverflow.com/a/36886459/1201456). If you have a similarly trivial hack, well that's what this question is all about. Right now I have a working-ish solution in about 20 lines of Python, which is worlds less complex than trying to assemble a wheel with cmake or teach setup.py about cmake. – nickelpro Aug 16 '20 at 12:22

1 Answer

This works. distutils and setuptools have to be some of the worst-designed pieces of central Python infrastructure that exist.

from setuptools import setup, find_packages, Extension, Distribution
from setuptools.command.build_ext import build_ext
import os
import pathlib
import shutil

suffix = '.pyd' if os.name == 'nt' else '.so'

class CustomDistribution(Distribution):
  def iter_distribution_names(self):
    for pkg in self.packages or ():
      yield pkg
    for module in self.py_modules or ():
      yield module

class CustomExtension(Extension):
  def __init__(self, path):
    self.path = path
    super().__init__(pathlib.PurePath(path).name, [])

class build_CustomExtensions(build_ext):
  def run(self):
    for ext in (x for x in self.extensions if isinstance(x, CustomExtension)):
      source = f"{ext.path}{suffix}"
      build_dir = pathlib.PurePath(self.get_ext_fullpath(ext.name)).parent
      os.makedirs(f"{build_dir}/{pathlib.PurePath(ext.path).parent}",
          exist_ok = True)
      shutil.copy(f"{source}", f"{build_dir}/{source}")

def find_extensions(directory):
  extensions = []
  for path, _, filenames in os.walk(directory):
    for filename in filenames:
      filename = pathlib.PurePath(filename)
      if filename.suffix == suffix:
        extensions.append(CustomExtension(os.path.join(path, filename.stem)))
  return extensions

setup(
  # Stuff
  ext_modules = find_extensions("PackageRoot"),
  cmdclass = {'build_ext': build_CustomExtensions},
  distclass = CustomDistribution,
)

I'm copying the prebuilt extensions into the build directory, and that's it. We override the distribution's iter_distribution_names to lie to the egg-info writers (which generate top_level.txt) about having any extensions, and everything is gravy.
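To see why overriding `iter_distribution_names` matters, here is a simplified model of how the top-level name list might be derived. This is not setuptools' actual code: the helper names are hypothetical, the distribution is faked with `SimpleNamespace`, and I'm assuming the writer keeps only the first dotted component of each name it is given.

```python
from types import SimpleNamespace


def default_names(dist):
    """Simplified model of the stock behavior: packages, modules, extensions."""
    yield from dist.packages or ()
    yield from dist.py_modules or ()
    for ext in dist.ext_modules or ():
        yield ext.name


def filtered_names(dist):
    """Model of the override in the answer: extensions are left out."""
    yield from dist.packages or ()
    yield from dist.py_modules or ()


def top_level_names(names):
    """Keep the first dotted component of each name, deduplicated and sorted."""
    return sorted({name.split(".", 1)[0] for name in names})


# Hypothetical project: one package plus one prebuilt extension named "_core".
dist = SimpleNamespace(
    packages=["PackageRoot", "PackageRoot.sub"],
    py_modules=None,
    ext_modules=[SimpleNamespace(name="_core")],
)

print(top_level_names(default_names(dist)))   # ['PackageRoot', '_core']
print(top_level_names(filtered_names(dist)))  # ['PackageRoot']
```

In this model the extension's bare name shows up as a separate top-level entry, even though the file actually lives inside the package; leaving extensions out of the generator removes that entry.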

nickelpro
  • `top_level.txt` is generated when the `easy_install` command is executed (IIRC it is called by `install`). The relevant snippet: [`setuptools.command.easy_install`](https://github.com/pypa/setuptools/blob/f991fbb3c9d0e10a0a78ae2b508b3fd99f9cdef2/setuptools/command/easy_install.py#L1034-L1040) – hoefling Aug 16 '20 at 12:50
  • @hoefling That's the top_level.txt that's written to the install directory when using the easy_install command to install an exe as an egg on Windows (and easy_install is deprecated anyway). It has nothing to do with the one generated by distutils when building a wheel, believe me I've tried modifying that exact function just to make sure. I agree that is the only place in distutils, setuptools, and the wheel build extension where "top_level" is mentioned. Now you know my frustration. – nickelpro Aug 16 '20 at 13:11
  • Your comment spurred me to find the writer, which I did, [here](https://github.com/pypa/setuptools/blob/f991fbb3c9d0e10a0a78ae2b508b3fd99f9cdef2/setuptools/command/egg_info.py#L661-L668). It's pulling from `iter_distribution_names`, which maybe I can modify into getting the desired behavior – nickelpro Aug 16 '20 at 13:31
  • Yep, my bad - looks like it's the [`write_toplevel_names`](https://github.com/pypa/setuptools/blob/f991fbb3c9d0e10a0a78ae2b508b3fd99f9cdef2/setuptools/command/egg_info.py#L661-L668) function. It is registered via an [entry point](https://github.com/pypa/setuptools/blob/f991fbb3c9d0e10a0a78ae2b508b3fd99f9cdef2/setup.py#L171), thus no direct mention of it in the source code. Overall, this was my point - the task of overriding distutils to achieve custom behaviour is often so tedious that it's just not worth it, in the end. Edit: you got the writer first, though :-) – hoefling Aug 16 '20 at 13:32
  • Yes, but now we've got it: we just need to override the `iter_distribution_names` method to not return extensions and we're done. 25 lines of code and the desired behavior is achieved, shorter and simpler than anything else. If distutils and setuptools weren't so heavily over-designed, it would be trivial to figure out what those 25 lines were. – nickelpro Aug 16 '20 at 13:54