I am writing a Python module that includes Cython extensions and uses LAPACK (and BLAS). I am open to using either clapack or lapacke, or some kind of f2c or f2py solution if necessary. What is important is that I am able to call lapack and blas routines from Cython in tight loops without Python call overhead.

I've found one example here. However, that example depends on SAGE. I want my module to be installable without installing SAGE, since my users are not likely to want or need SAGE for anything else. My users are likely to have packages like numpy, scipy, pandas, and scikit-learn installed, so those would be reasonable dependencies. What is the best combination of interfaces to use, and what would a minimal setup.py file look like that fetches the information needed for compilation (from numpy, scipy, etc.)?

EDIT: Here is what I ended up doing. It works on my MacBook, but I have no idea how portable it is. Surely there's a better way.

from distutils.core import setup
from distutils.extension import Extension

import numpy
from Cython.Build import cythonize
from Cython.Distutils import build_ext
from numpy.distutils.system_info import get_info

# TODO: This cannot be the right way.
# Fragile: assumes the second entry of extra_compile_args is a "-I<dir>" flag
# and strips the leading "-I" to recover the BLAS header directory.
blas_include = get_info('blas_opt')['extra_compile_args'][1][2:]
includes = [blas_include, numpy.get_include()]

setup(
    cmdclass={'build_ext': build_ext},
    ext_modules=cythonize([
        Extension("cylapack", ["cylapack.pyx"],
                  include_dirs=includes,
                  libraries=['blas', 'lapack']),
    ]),
)

This works because, on my MacBook, the clapack.h header file is in the same directory as cblas.h. I can then do this in my .pyx file:

cimport numpy as np

ctypedef np.int32_t integer

cdef extern from "cblas.h":
    double cblas_dnrm2(int N, double *X, int incX)

cdef extern from "clapack.h":
    integer dgelsy_(integer *m, integer *n, integer *nrhs,
                    double *a, integer *lda, double *b, integer *ldb,
                    integer *jpvt, double *rcond, integer *rank,
                    double *work, integer *lwork, integer *info)
jcrudy

1 Answer

If I have understood the question correctly, you could make use of SciPy's Cython wrappers for BLAS and LAPACK routines. These wrappers are documented in the SciPy reference under the scipy.linalg.cython_blas and scipy.linalg.cython_lapack modules.

As the documentation states, you are responsible for checking that any arrays that you pass to these functions are aligned correctly for the Fortran routines. You can simply import and use these functions as needed in your .pyx file. For instance:

from scipy.linalg.cython_blas cimport dnrm2 
from scipy.linalg.cython_lapack cimport dgelsy 

Given that this is well-tested, widely-used code that runs on different platforms, I'd argue that it is a good candidate for reliably distributing Cython extensions that directly call BLAS and LAPACK routines.
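
A setup.py for an extension that cimports these wrappers can be quite small. What follows is only a minimal sketch under a few assumptions, not a tested recipe: the source file is called cylapack.pyx (the name used in the question), the helper extension_kwargs is a made-up name for illustration, and setuptools, Cython and SciPy are available at build time. The point it tries to show is that no BLAS or LAPACK libraries or header directories need to be listed.

```python
# setup.py (sketch). Assumes cylapack.pyx gets its BLAS/LAPACK routines only
# through `cimport` from scipy.linalg.cython_blas / scipy.linalg.cython_lapack.
import sys


def extension_kwargs(uses_numpy=False):
    """Return keyword arguments for setuptools.Extension (hypothetical helper).

    No 'libraries' entry and no BLAS header directory are listed: Cython finds
    SciPy's .pxd files through the installed scipy package, and the routines
    are resolved when the compiled module is imported.
    """
    kwargs = {"name": "cylapack", "sources": ["cylapack.pyx"]}
    if uses_numpy:
        # Only needed when the .pyx file also does `cimport numpy`.
        import numpy
        kwargs["include_dirs"] = [numpy.get_include()]
    return kwargs


# setup.py is normally run with a command such as `build_ext --inplace`;
# the argv check keeps the build from starting when no command is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    from setuptools import Extension, setup
    from Cython.Build import cythonize

    setup(
        name="cylapack",
        ext_modules=cythonize([Extension(**extension_kwargs(uses_numpy=True))]),
    )
```

With a file like this next to cylapack.pyx, the extension would be built with `python setup.py build_ext --inplace`. Pass `uses_numpy=False` if the .pyx file does not cimport numpy.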


If you do not want your code to have a dependency on the entirety of SciPy, you can find many of the relevant files for these wrapper functions in SciPy's linalg directory here. A useful reference is these lines of setup.py which list the source and header files. Note that a Fortran compiler is required!

In theory it should be possible to isolate only the source files here that are needed to compile the BLAS and LAPACK Cython wrappers and then bundle them as an independent extension with your module.

In practice this is very fiddly to do. The build process for the linalg submodule requires some Python functions to aid the compilation on different platforms (e.g. from here). Building also relies upon other C and Fortran source files (here), the paths of which are hard-coded into these Python functions.

Clearly a lot of work has gone into making sure that SciPy compiles properly on different operating systems and architectures.

I'm sure it is possible to do, but after shuffling files about and tweaking paths, I have not yet found the right way to build this part of the linalg submodule independently from the rest of SciPy. Should I find the correct way, I'll be sure to update this answer.

Alex Riley
  • My impression is that the question is more about how to link to the blas that comes with scipy/numpy and have it work on multiple computers (with scipy/numpy installed) without having to recompile on each computer. But the advice of using the scipy provided wrappers is good. – DavidW Feb 19 '17 at 16:00
  • Ah, you may be correct there. If the OP provides further clarification I can refine or remove this answer as necessary (or convert it to a comment). – Alex Riley Feb 19 '17 at 16:33
  • This is a great way to do it. @DavidW is correct that I was looking for a solution that's portable and doesn't require users to install additional libraries. I think this is those things with an appropriate setup.py file. ajcr, can you add an example setup.py showing how to get any necessary include directories and such? I think numpy.get_include() is all that's needed? – jcrudy Feb 19 '17 at 19:40
  • @jcrudy I may have misunderstood slightly then: you're looking for something that compiles portably (which I think this answer is good for) rather than something that lets you move the compiled library around portably (which I'm not sure whether it's likely to work)? If that's the case then I think this answer has most of what you need (except possibly setup.py) – DavidW Feb 19 '17 at 20:10
  • It's also worth noting that this is a newish feature and I don't think it was around back when you asked the question (not that that matters) – DavidW Feb 19 '17 at 20:11
  • @DavidW You're correct on both points. I was looking for a solution that compiles anywhere numpy and scipy are installed (and an appropriate compiler and such, of course), and scipy didn't have the cython_blas and cython_lapack modules at the time. The project I was working on at the time is still around, and I actually switched it over to this new system fairly recently. It's a much nicer system. – jcrudy Feb 20 '17 at 01:24
  • And yes, I think this solution has everything except the setup.py, which would make it complete. – jcrudy Feb 20 '17 at 01:31
  • I'll look to add a setup.py example as soon as possible (the beginning of this week is a little hectic, but I should have a chance in a few days). – Alex Riley Feb 20 '17 at 08:05
  • @jcrudy: unfortunately, building only the Cython BLAS/LAPACK wrappers part of SciPy is trickier than I'd initially hoped and I have not yet been successful. I've updated my answer to acknowledge this. I'm sure there must be a way and I will update my answer if I find it! – Alex Riley Feb 24 '17 at 23:01
  • @ajcr I think the entirety of scipy is a pretty reasonable build dependency. I realize my original question only says numpy, but really the idea was to only depend on stuff that users of, say, scikit-learn, are likely to already have installed. The point is to make installing easy. I'll edit my question to match. – jcrudy Feb 25 '17 at 22:19
  • @jcrudy: ah, I see. In that case, simply installing SciPy is by far the easiest way to make these wrappers available. Do let me know if there's anything more you'd like to put into my answer. – Alex Riley Feb 26 '17 at 10:23
  • @ajcr I think all it needs is an example setup.py that does whatever is necessary to compile a cython extension that uses cython_blas/cython_lapack. Or, if nothing special needs to happen in the setup.py, just point that out in the answer. I have a working example here: https://github.com/scikit-learn-contrib/py-earth. However, that's a complicated project and I don't really know what parts are necessary to use the blas and lapack functionality. – jcrudy Feb 28 '17 at 07:35
  • For those reading this later, nothing special needs to be done in the `setup.py` file to use the wrappers from SciPy. – IanH Jul 30 '18 at 22:03