
I'm currently creating my typed memoryviews in my Cython .pyx file as follows:

import numpy as np
cimport cython

@cython.boundscheck(False)
cdef int[:] fill_memview():
    # This happens inside a big loop so needs to be fast
    # dtype=np.intc matches the C int type of the memoryview
    cdef int[:] x = np.empty(10, dtype=np.intc)
    cdef int i
    for i in range(10):
        x[i] = i
    return x

cdef stupid_loop():
    cdef int i
    for i in range(10000):
        fill_memview()

When I compile the .pyx file with cython -a foo.pyx, the line cdef int[:] x = np.empty(10, dtype=np.intc) shows up dark yellow in the resulting annotated HTML file (meaning it makes lots of Python calls that slow things down).

How can I instantiate my typed memoryview more efficiently?

LondonRob
2 Answers


See this answer for a comparison of different ways of allocating memory. If your needs are simple (just indexing), pay particular attention to the 'cpython.array raw C type' approach: you can create a cpython array cheaply and then use a.data.as_ints[i] for fast, unsafe indexing. If you really do need a memoryview, a memoryview on a cpython array is about 3x faster to create than one on a NumPy array.
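As a minimal sketch of both approaches (the function names here are mine, not from the linked answer; the clone/template pattern follows Cython's documented cpython.array usage):

from cpython cimport array
import array

# Reusable template; clone() copies the typecode, not the data.
cdef array.array int_template = array.array('i', [])

cdef int[:] fill_with_array_view():
    # Memoryview backed by a cpython array: cheaper to create
    # than a NumPy-backed one, identical indexing afterwards.
    cdef array.array a = array.clone(int_template, 10, zero=False)
    cdef int[:] x = a
    cdef int i
    for i in range(10):
        x[i] = i
    return x

cdef array.array fill_with_raw_pointer():
    # Or skip the memoryview and index through the raw C pointer:
    # fastest, but completely unchecked.
    cdef array.array a = array.clone(int_template, 10, zero=False)
    cdef int i
    for i in range(10):
        a.data.as_ints[i] = i
    return a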

Without a bigger picture of what your code does it's difficult to offer more specific advice. For example, if possible you would be better served by a two-dimensional array: allocating one large chunk of memory tends to be much more efficient than allocating lots of little ones, and it's much quicker to take many small memoryview slices of one big memoryview (backed by one big block of allocated memory) than to create many small memoryviews, each with its own little allocation.
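A rough sketch of that idea (names and sizes are illustrative only):

import numpy as np

cdef void fill_rows():
    # One big allocation up front, then cheap slices of it in the
    # loop, instead of one small allocation per iteration.
    cdef int[:, :] big = np.empty((10000, 10), dtype=np.intc)
    cdef int[:] row
    cdef int i, j
    for i in range(10000):
        row = big[i]  # slicing an existing view avoids a new buffer
        for j in range(10):
            row[j] = j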

Blake Walsh

Your memoryview is slow(er than strictly necessary) because Python needs to reference-count it. You can allocate the memory by hand using the Python/C API, but then you're responsible for freeing it when you no longer need it.
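A minimal sketch of what that looks like (the function name is hypothetical; ownership of the buffer passes to the caller, who must free it):

from cpython.mem cimport PyMem_Malloc, PyMem_Free

cdef int* make_filled_buffer(Py_ssize_t n) except NULL:
    # Hand-allocated buffer: no Python object, so no reference
    # counting overhead.
    cdef int* buf = <int*> PyMem_Malloc(n * sizeof(int))
    cdef Py_ssize_t i
    if buf == NULL:
        raise MemoryError()
    for i in range(n):
        buf[i] = <int> i
    return buf

# The caller is responsible for the matching free:
#     cdef int* data = make_filled_buffer(10)
#     ...
#     PyMem_Free(data)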

Don't do this unless you've used a profiler and are seeing an unacceptable amount of refcounting overhead. Premature optimization is never a good idea, and it's easy to introduce memory leaks or segfaults with this method.

Kevin
  • Can you explain what you mean by "premature optimization"? – LondonRob Nov 10 '14 at 14:58
  • You are trying to make your code faster, but until you measure performance with a profiler, you don't know that it's spending much time executing this code. The fact that it happens to be in a tight loop means it *might* be, but without empirical data it's difficult to be certain. Perhaps the O(1) Python overhead is dominated by the O(n) initialization (i.e. the `x[i] = i` line). – Kevin Nov 10 '14 at 15:01