0

I'm using Cython, to speed up my CPU-intensive Python program. However, my pythonic program makes excessive use of NumPy functions, and I'd want to reduce the amount of C-Python interaction. The code for this example is shown below.

import numpy as np
cimport numpy as cnp

 
def  test_func():
     cdef cnp.ndarray[cnp.float64_t, ndim=2] a
     cdef unsigned int i, j
 
 
     a = np.empty((100, 100), dtype=np.float64)
     cdef double s = 0
     for i in range(100):
         for j in range(100):
             s += a[i][j]**78 * 5

The code executes correctly, however the 'cython -a test.pyx' command highlights the lines 'a = np.empty((100, 100), dtype=np.float64)' and 's += a[i][j]**78 * 5'. Is there any additional type declaration I'm missing? Furthermore, the performance of this cython code and its Python version is nearly identical.

Raghvender
  • 63
  • 6
  • You previously used `cnp`, can you use that again rather than `np`? – Chrispresso Mar 24 '22 at 15:18
  • 3
    You might try it with a memory view instead of the old cnp.ndarrays, i.e. `cdef double[:, ::1] a = np.empty((100, 100), dtype=np.float64)`. However, the function doesn't make any sense: It doesn't return anything and the `a` is not filled with any values. – joni Mar 24 '22 at 16:33
  • @Chrispresso I tried with cnp, nothing changes. – Raghvender Mar 24 '22 at 22:21
  • @joni Thanks for the help. Now it got twice faster than before. you are right, the function does not make any sense but as said my goal was to benchmark the speed. – Raghvender Mar 24 '22 at 22:31

1 Answers1

3

a[i][j] should become a[i, j] (i.e. "index a single element", instead of "make a slice then index that").

You might also simplify the indexing with the Cython directives boundscheck(False) (which skips the check that your indices are valid, at the cost that an invalid index will cause a hard crash) and wraparound(False) (which means negative indices won't work). Before applying these directives make sure that you understand them and that they are suitable for you. Avoid cargo-cult copy-pasting these directives everywhere (like people seem to want to do).

a = np.empty((100, 100), dtype=np.float64)

requires a Python call. There's no way around it. It's also unlikely to be a real bottleneck (it's outside your loops, for example) so you should probably just live with it.

DavidW
  • 29,336
  • 6
  • 55
  • 86
  • See also https://stackoverflow.com/questions/70821944/what-parts-of-a-numpy-heavy-function-can-i-accelerate-with-cython (which deals with a much more general version of this question) – DavidW Mar 24 '22 at 21:36
  • Thanks. I found the link very informative. – Raghvender Mar 24 '22 at 22:35