
I'm using Cython to compile a function to C, but I get an "Unreachable code" warning. When I inspect the generated pyx file, I see an additional return locals() statement, and I don't understand how it got there.

The code is generated by cython.inline:

cython.inline('return a * b + c if a > b else 0.0', a=1, b=2, c=3)

which produces a pyx file that looks like this:

def __invoke(double a, double b, double c):
    return a * b + c if a > b else 0.0
    return locals()

The reason I am cythonizing this function is to improve performance. The above function is a simplification, but the basic elements are the same. Note that the inline function is not using numpy arrays. If anyone can think of a faster way to evaluate the expression, I am happy to try it out (the syntax for the original expression is a bit different, but I can compile it to any format).

Anyway, the main point of this question is to understand why an additional return statement has been added and how to remove it.

Update

This is the overhead I've noticed from the cython.inline calls (refers to conversation with @DavidW).

[Screenshot: RunSnakeRun profile of the cython.inline calls]

orange

1 Answer


I think it's so that if you don't add a return statement you get back a dictionary of local variables. E.g.

cython.inline('''x = a*b
y = b+c
z = a-c
''', a=1, b=2, c=3)

will give you back a dictionary of x, y, and z. Obviously it's a bit unnecessary since you could do that manually yourself, but it makes some use-cases easy (and would break compatibility with existing code if removed).

Cython implements this by appending return locals() to the end of everything it compiles with cython.inline. You can find it in the source code.
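To make the mechanism concrete, here is a minimal plain-Python sketch of the same wrapping idea. It is not Cython's actual implementation: build_wrapper_source and emulate_inline are hypothetical helpers written for illustration, and exec stands in for the compile step. In this emulation the returned dictionary also contains the arguments, because Python's locals() includes them.

```python
import textwrap


def build_wrapper_source(body, arg_names):
    """Hypothetical helper: wrap a code string in a function definition
    and append `return locals()` as the final statement."""
    indented = textwrap.indent(body.strip("\n"), "    ")
    signature = ", ".join(arg_names)
    return "def __invoke({}):\n{}\n    return locals()\n".format(signature, indented)


def emulate_inline(body, **kwargs):
    """Hypothetical stand-in for cython.inline: build the wrapper,
    run it with plain exec (no C compilation), and call it."""
    source = build_wrapper_source(body, list(kwargs))
    namespace = {}
    exec(source, namespace)
    return namespace["__invoke"](**kwargs)


# No explicit return: the appended `return locals()` supplies the result.
no_return = emulate_inline("x = a*b\ny = b+c\nz = a-c", a=1, b=2, c=3)

# Explicit return: the appended statement can never run, which is exactly
# what the "Unreachable code" warning is pointing at.
with_return = emulate_inline("return a * b + c if a > b else 0.0", a=1, b=2, c=3)
```

With the first snippet the caller gets a dictionary containing x, y and z; with the second the explicit return wins and the trailing statement is dead code.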

I don't think you can get rid of it, but it also costs you nothing (except a compiler warning) - it's obvious to the C compiler that the code is unreachable so it never gets generated.


To answer your secondary question about improving performance - this kind of calculation presumably only matters if it's called repeatedly? I'd try to get the loop in Cython too if at all possible, otherwise I'd be surprised if you gained much.

DavidW
  • Thanks that makes sense. I forgot to check the source code for this... Yes, this function is called repeatedly. I noticed a nice speedup by cythonizing it, but there's still a bit of overhead from a few calls in Cython which would be nice to get rid of (although once compiled, the function is cached). I wonder if it's due to the inlining or just the way Cython works. The loop can't be included because the calls happen at different places and also deep down far away from the actual loop (not like a kernel function on some array data). – orange Sep 19 '16 at 18:54
  • I've updated the question with a runsnakerun screenshot of the relevant calls. – orange Sep 19 '16 at 19:01
  • @orange I think you're mostly hitting the stuff to check if it's cached (look at the code around the github link in my answer - it's quite involved!), which ends up being significant for a small function. You'd get better performance if you just made a one line function in a pyx file and compiled it normally. There's still overhead checking types as the function is called, but it's less. – DavidW Sep 19 '16 at 21:37
  • A second option (but it strongly depends on what libraries you use) could be to use PyPy. This might be just the sort of thing that it does well. (Or you could depend on scipy and be mostly stuck...) – DavidW Sep 19 '16 at 21:39
  • Thanks heaps for your pyx suggestion. I was contemplating doing this, but didn't want to deal with all the temporary files/recompilation in case of change issues. But the pyx solution would also allow me to turn off some of the cython checks (speedup) using directives which IMHO isn't currently possible with inline cython functions. Would you know if I could use StringIO buffers to create the pyx, i.e. not use physical files (I don't think so as the C compiler has to work on a file, but I thought I'd ask)? PyPy isn't really an option as there are too many other libraries that aren't supported. – orange Sep 20 '16 at 07:59
  • pyximport might be your best option. It should detect changes and only recompile when needed. I don't know about StringIO - [gcc can work with strings from standard input](http://stackoverflow.com/questions/1003644/is-it-possible-to-get-gcc-to-read-from-a-pipe) but I've got no idea how you'd do that with python/cython. – DavidW Sep 20 '16 at 09:54
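As a rough sketch of the pyx-plus-pyximport route discussed in these comments: the snippet below writes a one-function pyx file to a temporary directory, and a separate helper compiles and imports it on demand. The names write_pyx, load_compiled, expr_kernel and evaluate are made up for illustration, and the cdivision directive is only an example of a module-level directive (it has no effect on this particular expression). load_compiled assumes Cython and a C compiler are installed, so it is defined here but not called.

```python
import os
import tempfile

# Module-level directives go in a special comment at the top of the pyx file.
PYX_TEMPLATE = """# cython: cdivision=True
def evaluate(double a, double b, double c):
    return {expression}
"""


def write_pyx(expression, directory, module_name="expr_kernel"):
    """Hypothetical helper: write the expression into a small pyx module."""
    path = os.path.join(directory, module_name + ".pyx")
    with open(path, "w") as handle:
        handle.write(PYX_TEMPLATE.format(expression=expression))
    return path


def load_compiled(directory, module_name="expr_kernel"):
    """Compile and import the module through pyximport.
    Assumes Cython and a working C compiler are available."""
    import importlib
    import sys

    import pyximport

    pyximport.install(build_dir=os.path.join(directory, "build"))
    sys.path.insert(0, directory)
    return importlib.import_module(module_name)


workdir = tempfile.mkdtemp()
pyx_path = write_pyx("a * b + c if a > b else 0.0", workdir)
```

Calling load_compiled(workdir).evaluate(1.0, 2.0, 3.0) would then go straight to the compiled function, without the per-call cache lookup that cython.inline performs. The pyx file still has to exist on disk, so this does not answer the StringIO part of the question.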