I am trying to write a run-length encoding arithmetic library for Python in Cython. Below you can see the declarations and the important parts of the algorithm's hot loop. There are two places with heavy and moderate Python interaction: lines 73-74 and line 77. The C code generated for the heavy Python interaction is shown in a picture at the end. I will only ask how to fix lines 73-74 here, as I imagine the fix for line 77 will be similar.

As you can see, the generated C code 1) does a lot of type casting, 2) uses richcompare and 3) uses getitemint. I do not understand why: 1) the types should be the same, 2) the comparison should be possible at the C level, since it just compares numbers of the same type, and 3) getitem should be superfluous, since it is just an index lookup in a C array.

How can I fix this to optimize my code? Is the problem that the numpy array declarations create Python objects and that I need to give a pointer to them in some way?

enter image description here

Here you see the C code Cython generated for the two dark and light yellow places in my hot loop:

enter image description here

The Unfun Cat

1 Answer

You haven't typed nvs or nrs, so they're treated as Python objects (and hence nv must be converted to a Python object for the comparison).

Do:

cdef long[:] nrs = np.zeros( # ... as before
cdef double[:] nvs = np.zeros( # ... as before

(Also, while the images of the html are helpful, it would have been much easier to read if you'd included the code as text too...)
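
Since the original code is only available as images, here is a minimal sketch of what typed declarations could look like in a small run-length encoding loop. The function name `rle_sketch` and the parameter `values` are hypothetical and not taken from the question; only the `nrs`, `nvs` and `nv` names and their types follow the declarations above. The dtype string `"l"` is NumPy's character code for a C `long`, used here so the buffer matches `long[:]` on every platform.

```cython
# cython: boundscheck=False, wraparound=False
import numpy as np

def rle_sketch(double[:] values):
    """Hypothetical example: run-length encode a 1-D array of doubles."""
    cdef Py_ssize_t n = values.shape[0]
    cdef Py_ssize_t i, k = 0
    cdef double nv
    # Typed memoryviews: indexing and comparison stay at the C level
    # instead of going through Python objects.
    cdef long[:] nrs = np.zeros(n, dtype="l")          # run lengths
    cdef double[:] nvs = np.zeros(n, dtype=np.float64)  # run values

    if n == 0:
        return np.asarray(nvs), np.asarray(nrs)

    nvs[0] = values[0]
    nrs[0] = 1
    for i in range(1, n):
        nv = values[i]
        if nv == nvs[k]:      # plain C double comparison
            nrs[k] += 1
        else:                 # start a new run
            k += 1
            nvs[k] = nv
            nrs[k] = 1
    # Trim to the number of runs actually found.
    return np.asarray(nvs[:k + 1]), np.asarray(nrs[:k + 1])
```

Running `cython -a` on a file like this should show the loop body without yellow highlighting, provided every variable used inside it is typed as above. The input must be a `float64` array, otherwise the memoryview conversion raises an error at call time.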

DavidW
  • Yes, I was thinking about that just now; also images aren't google/hooliable. – The Unfun Cat Apr 08 '18 at 08:41
  • I definitely get the value of the highlighting though. But for me it's mostly that they aren't copyable - I can't quickly get the code in an editor and change it. – DavidW Apr 08 '18 at 08:43
  • @TheUnfunCat I would recommend using `long[::1] nrs=...`, also in the function's signature. This makes it clear that the memory is contiguous, which leads to the generation of more efficient code; see for example https://stackoverflow.com/q/49058949/5769463 – ead Apr 08 '18 at 18:56