
I want to build a Cython class that holds a variable number of memoryviews. I want to be able to shuffle and reorder the memoryviews without copying large amounts of data, while keeping the benefit of knowing the base type of the arrays (double) at compile time.

I have not managed to do this efficiently so far.

Here is the baseline class definition:

import numpy as np
cimport numpy as np

cdef class Container:
    cdef double[:] ary          # a single typed memoryview over a 1-D double array

    def __cinit__(self, long n):
        self.ary = np.zeros(n, dtype=np.float64)

    cpdef void fill(self, double v):
        cdef long i
        for i in range(len(self.ary)):
            self.ary[i] = i + v

When I run a speed benchmark as follows:

a = Container(1000)
%timeit a.fill(10)

I get 550 ns per run of fill.
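
For reference, a minimal build script for such an extension could look like the following sketch (the filename container.pyx and the module name are assumptions for illustration, not taken from the question):

# setup.py -- minimal build sketch; "container.pyx" and the module name
# "container" are made up for illustration
from setuptools import setup, Extension
from Cython.Build import cythonize
import numpy as np

ext = Extension(
    "container",
    sources=["container.pyx"],
    include_dirs=[np.get_include()],   # needed because of "cimport numpy"
)

setup(ext_modules=cythonize([ext], language_level=3))

# build in place with:  python setup.py build_ext --inplace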

Now I want to have several memoryviews inside my class, so I tried instead:

import numpy as np
cimport numpy as np

cdef class Container2:
    cdef object[:] ary          # memoryview of Python objects, one numpy array per slot

    def __cinit__(self, long n, long m):
        cdef Py_ssize_t i
        self.ary = np.empty(n, dtype=object)
        for i in range(n):
            self.ary[i] = np.zeros(m, dtype=np.float64)

    cpdef void fill(self, double v):
        cdef long i, j
        cdef double[:] sl
        for i in range(len(self.ary)):
            sl = self.ary[i]
            for j in range(len(self.ary[i])):
                sl[j] = j + v

Container2 is the same as Container, except that it holds a variable number of typed memoryviews instead of just one.

When I run a speed benchmark again using

b = Container2(10, 1000)
%timeit b.fill(10)

I now get 9.5 µs per run; spread over the 10 arrays, that translates into 950 ns per array of size 1000, which is the figure to compare with the 550 ns above.

So the per-array processing time has nearly doubled.

Surely there must be a better and more efficient way.

Chris
  • You're probably better just making `ary` a list rather than a memoryview with object type. Internally they'll actually be very similar, but a list is a simpler representation of what you want. I doubt the performance will be _that_ different though. – DavidW May 17 '20 at 13:11
  • Alternatively if the `double[:]` are all the same size you could look at [the indirect buffer layout](https://stackoverflow.com/search?q=%5Bcython%5D+indirect+buffer) - you'd have to implement it yourself, but https://stackoverflow.com/questions/10465091/assembling-a-cython-memoryview-from-numpy-arrays/12991519#12991519 and https://stackoverflow.com/questions/53950020/cython-understanding-a-typed-memoryview-with-a-indirect-contignuous-memory-layo/53951543#53951543 look potentially useful – DavidW May 17 '20 at 13:15
  • @DavidW Thanks. I tried that. Actually I get a marginal performance boost of 6-7 % using a list and iterating over the list which is quite unexpected to me. – Chris May 17 '20 at 13:51
  • @DavidW (response to your second comment). This link is quite an eye opener for me. Did not know that indirect layouts were supported in cython. Many thanks for pointing this out for me!! – Chris May 17 '20 at 14:06
  • About the `list` - it's worth remembering that Numpy has code to account for things like multiple dimensions, so for a simple 1D case a list can probably skip that. Good luck with the indirect layouts - I think it is the tool you want, but I can't claim to understand them well enough to advise further. – DavidW May 17 '20 at 14:48
  • @user2116486 I have implemented the buffer protocol for indirect memory layouts, which might be useful for you: https://github.com/realead/indirect_buffer at least as a place to start out. Probably `BufferCollection2D` (+ some extension) from the above library is what you need. – ead May 18 '20 at 08:20
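
To make the list-based suggestion from the comments concrete, here is a rough sketch (not from the original thread) of what that variant could look like; the class name Container3 is made up for illustration:

import numpy as np
cimport numpy as np

cdef class Container3:
    # A plain Python list of numpy arrays: shuffling or reordering the list
    # only moves references, the array data itself is never copied.
    cdef list arys

    def __cinit__(self, long n, long m):
        cdef Py_ssize_t i
        self.arys = []
        for i in range(n):
            self.arys.append(np.zeros(m, dtype=np.float64))

    cpdef void fill(self, double v):
        cdef Py_ssize_t i, j
        cdef double[:] sl
        for i in range(len(self.arys)):
            sl = self.arys[i]          # builds one double[:] view per row
            for j in range(sl.shape[0]):
                sl[j] = j + v

The inner loop is still fully typed, so the per-element cost matches Container; what remains is the construction of one double[:] memoryview per row in fill, which would explain why the gain over Container2 is only the modest 6-7% reported in the comments.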

0 Answers