Cython - Class with numpy attributes

Question

I am trying to optimize a Python class by turning it into a Cython extension type. My operations use NumPy without any indexing at all, i.e. element-wise addition and matrix multiplication, so my primary goal is to reduce Python overhead. I am mostly interested in the type declarations: should I use NumPy or Cython structures? Here's a simple example:


cdef class Analyzer:
    cdef float coeff
    cdef float[:] v1, v2  # should I use a memoryview? can I use an ndarray?

    def __init__(self):
        # some stuff ...

    cpdef void func(self, matrix, vector):
        # some stuff ...
        cdef Py_ssize_t i
        for i in range(big_number):
            self.update_vectors(i)
            self.v1 = matrix @ vector + self.coeff * self.v2  # all my operations are like this, i.e. no indices
            # should I convert the views to ndarrays?

Note that when I try `self.coeff * self.v2`, I get an error, because the memoryviews are not treated as NumPy arrays:

Invalid operand types for '*' (float; float[:] *)

So, since I am using NumPy and there is no point in writing loops just for standard linear algebra, how do I declare the (class) attribute types to reduce overhead?

It may not be possible right now. See the docs [here](https://cython-docs2.readthedocs.io/en/latest/src/tutorial/numpy.html): "Fast array declarations can currently only be used with function local variables and arguments to def-style functions (not with arguments to cpdef or cdef, and neither with fields in cdef classes or as global variables)." — James, Nov 24 '19 at 21:05
Yes, it's not possible yet, but what I am asking is the most efficient way around it. Should I just avoid type declaration? Or should I try converting between cython arrays and numpy arrays? — consthatza, Nov 24 '19 at 21:14
If you aren't using indexing then there's absolutely no value in typing them. Just set the type as a generic `cdef object`. If you only use indexing in a few functions then you can just create a memoryview of the array inside the function. — DavidW, Nov 24 '19 at 21:30
@DavidW, I see... thanks for the advice. In your experience, will using `cdef object` be significantly faster than a plain Python `object`? I am looking for at least a 10x speed-up. — consthatza, Nov 24 '19 at 21:36
No. `cdef object` is sometimes around 1.5x faster than running the equivalent code in Python (but could be more or less depending on the size of the array - essentially Cython can call the function slightly faster than Python would, but the actual operation is already written in C in NumPy and will take exactly the same time). A 10x speed-up seems very unlikely. — DavidW, Nov 24 '19 at 21:40
To avoid the Python interpreter overhead you have to call the BLAS routines directly, e.g. https://stackoverflow.com/a/50093010/4045774. But the overhead becomes negligible as the matrices get bigger. — max9111, Nov 25 '19 at 10:14
Unrelated to your question but you might have problems with `cdef float`, I'd prefer using an explicit type, e.g. `np.float64`. You can also use fused types to create definitions that work for multiple different generic floating point types. — ngoldbaum, Nov 25 '19 at 15:52
Thanks for your advice, guys. I am running a Monte Carlo analysis, which means the methods are called thousands of times. That's why I am so fixated on reducing overhead. — consthatza, Nov 26 '19 at 08:45
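
On the `cdef float` concern raised in the comments: in Cython, `float` is the C single-precision type, whereas Python floats and NumPy's default `float64` are double precision. The following is a minimal sketch of that difference. It uses only the standard-library `ctypes` module rather than Cython itself, and the helper name `as_c_float` is hypothetical, introduced purely for illustration.

```python
import ctypes


def as_c_float(x):
    """Round a Python float (a C double) to C single precision and back."""
    return ctypes.c_float(x).value


# Storage sizes of the two C types, in bytes.
float_size = ctypes.sizeof(ctypes.c_float)
double_size = ctypes.sizeof(ctypes.c_double)

# 0.1 is not exactly representable in either type, but the single-precision
# approximation is much coarser than the double-precision one.
rounded = as_c_float(0.1)
error = abs(rounded - 0.1)

print("sizeof(float), sizeof(double):", float_size, double_size)
print("0.1 stored as a C float:", rounded)
print("rounding error vs. double:", error)
```

If the arrays hold `float64` values, declaring the scalar attribute as `cdef double` keeps it at the same precision as the array data.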
