
There are lots of questions about using numpy in cython on this site, a particularly useful one being Simple wrapping of C code with cython.

However, the Cython/NumPy interface API seems to have changed a bit, in particular with regard to ensuring that memory-contiguous arrays are passed.

What is the best way to write a wrapper function in cython that:

  • takes a numpy array that is likely but not necessarily contiguous
  • calls a C++ class method with the signature double* data_in, double* data_out
  • returns a numpy array of the double* that the method wrote to?

My try is below:

cimport numpy as np
import numpy as np # as suggested by jorgeca

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    cdef int N, D
    N = X.shape[0]
    D = X.shape[1]

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] X_c
    X_c = np.ascontiguousarray(X, dtype=np.double)

    cdef np.ndarray[np.double_t, ndim=1, mode="c"] Y_c
    Y_c = np.ascontiguousarray(np.zeros((N*D,)), dtype=np.double)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(<double*> X_c.data, N, D, <double*> Y_c.data)

    return Y_c.reshape(N, 2)

This code compiles but is not necessarily optimal. Do you have any suggestions on improving the snippet above?

An earlier version of it also threw a NameError ("np is not defined", on the line X_c = ...) when called at runtime. The exact testing code and error message were the following:

import numpy as np
import mywrapper
mywrapper.run(np.array([[1,2],[3,4]], dtype=np.double))

# NameError: name 'np' is not defined [at mywrapper.pyx":X_c = ...]
# fixed!

Michael Schubert
  • You still have to `import numpy as np` in your `.pyx` file to use numpy functions (`cimport numpy as np` ["is used to import special compile-time information about the numpy module"](http://docs.cython.org/src/tutorial/numpy.html#adding-types)). – jorgeca Jul 25 '13 at 11:39
  • @jorgeca I guess your comment answers the OP question... – Saullo G. P. Castro Jul 25 '13 at 12:15
  • @SaulloCastro I posted it as a comment because I thought it was a minor hurdle, but I don't know what the best way to write these interfaces is. – jorgeca Jul 25 '13 at 12:36
  • @jorgeca Thank you, it was indeed the missing statements that caused the error messages. And you are right, I'm mainly looking for optimisations :-) – Michael Schubert Jul 25 '13 at 12:41

1 Answer


You've basically got it right. First, optimization hopefully shouldn't be a big deal: ideally, most of the time is spent inside your C++ kernel, not in the Cython wrapper code.

There are a few stylistic changes you can make that will simplify your code. (1) Reshaping between 1D and 2D arrays is not necessary. When you know the memory layout of your data (C order vs. Fortran order, striding, etc.), you can treat the array as just a chunk of memory that you're going to index yourself in C++, so numpy's ndim doesn't matter on the C++ side: it only sees the pointer. (2) Using Cython's address-of operator &, you can get the pointer to the start of the array a little more cleanly, with no explicit cast necessary, by writing &X[0,0].
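To make point (1) concrete, here is a small pure-NumPy sketch (not Cython, and not part of the wrapper itself) of how a C-ordered 2D array maps onto the flat buffer that a `double*` would walk over. The shape 3x4 and the variable names are made up for illustration.

```python
import numpy as np

# Hypothetical 3x4 input; any C-contiguous 2D float64 array behaves the same way.
N, D = 3, 4
X = np.arange(N * D, dtype=np.double).reshape(N, D)

# ravel() on a C-contiguous array returns a view of the same buffer,
# i.e. the sequence of doubles that a pointer to X[0, 0] would step through.
flat = X.ravel()

# In row-major (C) order, element (i, j) sits at offset i * D + j.
for i in range(N):
    for j in range(D):
        assert flat[i * D + j] == X[i, j]

# The strides express the same layout in bytes:
# one row step is D doubles, one column step is one double.
row_stride, col_stride = X.strides
```

This is the indexing the C++ side would presumably do with the `N` and `D` arguments, which is why the array can stay 2D on the Python side.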

So this is my edited version of your original snippet:

cimport numpy as np
import numpy as np

cdef extern from "myclass.h":
    cdef cppclass MyClass:
        MyClass() except +
        void run(double* X, int N, int D, double* Y)

def run(np.ndarray[np.double_t, ndim=2] X):
    X = np.ascontiguousarray(X)
    cdef np.ndarray[np.double_t, ndim=2, mode="c"] Y = np.zeros_like(X)

    cdef MyClass myclass
    myclass = MyClass()
    myclass.run(&X[0,0], X.shape[0], X.shape[1], &Y[0,0])

    return Y
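
The `np.ascontiguousarray` call is what covers the "likely but not necessarily contiguous" case. The following pure-NumPy sketch illustrates its behavior under the assumption that the input is either already a C-contiguous float64 array or a view that is not; `as_c_double` is a hypothetical helper name used only here.

```python
import numpy as np

def as_c_double(X):
    """Hypothetical helper: return X as a C-contiguous float64 array."""
    return np.ascontiguousarray(X, dtype=np.double)

a = np.zeros((4, 3), dtype=np.double)  # already C-contiguous
b = a.T                                # transposed view: not C-contiguous
c = a[:, ::2]                          # strided view: not contiguous

a_c = as_c_double(a)  # no copy needed, still backed by a's buffer
b_c = as_c_double(b)  # copied into a fresh C-ordered buffer
c_c = as_c_double(c)  # copied as well

print(a.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'], c.flags['C_CONTIGUOUS'])
print(np.shares_memory(a_c, a), np.shares_memory(b_c, a), np.shares_memory(c_c, a))
```

So the common contiguous case should cost nothing extra, and a copy is made only when the layout requires one. Since the wrapper only reads from X, working on a copy does not change the result.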
Robert T. McGibbon
  • Say, can this be done with a typed memoryview in Cython instead of passing the array? I was not sure whether that would save some memory overhead, etc. – krishnab Mar 05 '17 at 23:00