38

I often struggle to find bottlenecks in my cython code. How can I profile cython functions line-by-line?

Till Hoffmann
  • 9,479
  • 6
  • 46
  • 64

3 Answers3

47

Robert Bradshaw helped me to get Robert Kern's line_profiler tool working for cdef functions and I thought I'd share the results on stackoverflow.

In short, set up a regular .pyx file and build script and add the following before your call to cythonize.

# Thanks to @tryptofame for proposing an updated snippet
from Cython.Compiler.Options import get_directive_defaults
directive_defaults = get_directive_defaults()

directive_defaults['linetrace'] = True
directive_defaults['binding'] = True

Furthermore, you need to define the C macro CYTHON_TRACE=1 by modifying your extensions setup such that

extensions = [
    Extension("test", ["test.pyx"], define_macros=[('CYTHON_TRACE', '1')])
]

A working example using the %%cython magic in the iPython notebook is here: http://nbviewer.ipython.org/gist/tillahoffmann/296501acea231cbdf5e7

Till Hoffmann
  • 9,479
  • 6
  • 46
  • 64
  • Very helpful, thanks. A detail: I've found that line_profiler states that the file profiled is the original .pyx file. I'm pretty sure I have everything pointing at the .pyd, so I suspect that the profiler just reads the content from the .pyd for display, while still getting the actual timings from the compiled version. – Giswok Dec 01 '15 at 17:18
  • For reference, here are the [Cython docs on line tracing](http://docs.cython.org/src/tutorial/profiling_tutorial.html#enabling-line-tracing). – 0 _ Dec 06 '15 at 12:06
  • There's also a more complete non-iPython example here: https://github.com/cython/cython/blob/master/tests/run/line_profile_test.srctree – deef Dec 23 '16 at 00:23
  • 5
    Has anyone tried this solution without using notebooks ? I tried bu it just ignores the cythonized code. Also if I try to decorate the function with ``@profile`` I'm not able do compile the file with disutils returning ``undeclared name not builtin: profile`` – Pedro Braz Aug 28 '17 at 19:11
  • I've asked this question [here](https://stackoverflow.com/questions/45925921/cannot-use-line-profiler-with-cython) – Pedro Braz Aug 28 '17 at 19:15
  • 2
    Notice this changed in recent versions: https://github.com/cython/cython/issues/1497#issuecomment-256400972 – Pietro Battiston Jan 31 '18 at 15:55
  • 2
    @%%cython magic in iPython notebook use the following code since "from Cython.Compiler.Options import directive_defaults" is deprecated import Cython directive_defaults = Cython.Compiler.Options.get_directive_defaults() – tryptofame Nov 12 '18 at 15:24
  • 4
    directive_defaults = Cython.Compiler.Options.get_directive_defaults() # since "from Cython.Compiler.Options import directive_defaults" seems to be deprecated – tryptofame Jul 03 '19 at 12:59
  • Since Cython 0.26 extensions contain relative paths to source files, so `line_profiler`/`kernprof` [can be unable to display them](https://github.com/cython/cython/issues/3929). You can workaround this by setting a drive root or a sibling folder as a working directory for cythonization. – EvgenKo423 Dec 03 '20 at 10:49
  • Just a note that you MUST use `def`, I used `cpdef` and the timing results showed up empty. – TimD1 Oct 14 '21 at 04:10
  • 2
    As of 2023 this doesn't seem to be working at all. I made a new notebook which is basically the same as the old, though with the updated get_directive_defaults() thing. It all runs, but the actual output is simply `Timer unit: 1e-09 s`. Is there any way forward at all? https://nbviewer.org/gist/battaglia01/f138f6b85235a530f7f62f5af5a002f0?flush_cache=true – Mike Battaglia Feb 11 '23 at 13:33
  • @MikeBattaglia : I'm running into the same issue :-( – G. Fougeron May 11 '23 at 07:21
  • @G.Fougeron the problem is the latest version of line_profiler silently dropped support for Cython. The older version (3.5.1, I think?) still has it! Also you can try, e.g. py-spy or other profilers – Mike Battaglia May 11 '23 at 08:27
  • Thanks for the tip, I'll try line_profiler v3.5.1 then! – G. Fougeron May 11 '23 at 09:15
  • Do you know whether there's an issue open for Cython support being dropped? – Till Hoffmann May 12 '23 at 18:17
9

While @Till's answer shows the way for profiling Cython-code using setup.py-approach, this answer is about ad-hoc profiling in IPython/Jupiter notebook and is more or less "translation" of Cython-documentation to IPython/Jupiter.

%prun-magic:

If %prun-magic should be used, then it is enough to set the Cython's compiler directive profile to True (here with example from Cython-documentation):

%%cython
# cython: profile=True

def recip_square(i):
    return 1. / i ** 3

def approx_pi(n=10000000):
    val = 0.
    for k in range(1, n + 1):
        val += recip_square(k)
    return (6 * val) ** .5 

Using global directive (i.e. # cython: profile=True) is a better way, than modifying the global Cython-state, because changing it will lead to extension being recompiled (which is not the case if the global Cython state is changed - the old cached version compiled with old global state will be reloaded/reused).

And now

%prun -s cumulative approx_pi(1000000)

yields:

        1000005 function calls in 1.860 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.860    1.860 {built-in method builtins.exec}
        1    0.000    0.000    1.860    1.860 <string>:1(<module>)
        1    0.000    0.000    1.860    1.860 {_cython_magic_404d18ea6452e5ffa4c993f6a6e15b22.approx_pi}
        1    0.612    0.612    1.860    1.860 _cython_magic_404d18ea6452e5ffa4c993f6a6e15b22.pyx:7(approx_pi)
  1000000    1.248    0.000    1.248    0.000 _cython_magic_404d18ea6452e5ffa4c993f6a6e15b22.pyx:4(recip_square)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

%lprun-magic

If the line profiler (i.e. %lprun-magic) should be used, then the Cython module should be compiled with different directives:

%%cython
# cython: linetrace=True
# cython: binding=True
# distutils: define_macros=CYTHON_TRACE_NOGIL=1
...

linetrace=True triggers creation of the tracing in the generated C-code and implies profile=True so it must not be set in addition. Without binding=True line_profiler doesn't have the necessary code-information and CYTHON_TRACE_NOGIL=1 is needed, so the line profiling is also activated when compiled with the C-compiler (and not thrown away by the C-preprocessor). It is also possible to use CYTHON_TRACE=1 if nogil-blocks should not be profiled on per-line basis.

Now it can be used for example as follows, passing the functions, which should be line-profiled via -f option (use %lprun? to get info about possible options):

%load_ext line_profiler
%lprun -f approx_pi -f recip_square approx_pi(1000000)

which yields:

Timer unit: 1e-06 s

Total time: 1.9098 s
File: /XXXX.pyx
Function: recip_square at line 5

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     5                                           def recip_square(i):
     6   1000000    1909802.0      1.9    100.0      return 1. / i ** 2

Total time: 6.54676 s
File: /XXXX.pyx
Function: approx_pi at line 8

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     8                                           def approx_pi(n=10000000):
     9         1          3.0      3.0      0.0      val = 0.
    10   1000001    1155778.0      1.2     17.7      for k in range(1, n + 1):
    11   1000000    5390972.0      5.4     82.3          val += recip_square(k)
    12         1          9.0      9.0      0.0      return (6 * val) ** .5

line_profiler has however a minor hiccup with the cpdef-function: it doesn't detect the function body correctly. In this SO-post, a possible workaround is shown.


One should be aware, that profiling (all above line profiling) changes the execution-time and its distribution compared with "normal" run. Here we see, that different times are needed depending on type of profiling, for the same function:

Method (N=10^6):        Running Time:       Build with:
%timeit                 1 second
%prun                   2 seconds           profile=True
%lprun                  6.5 seconds         linetrace=True,binding=True,CYTHON_TRACE_NOGIL=1
ead
  • 32,758
  • 6
  • 90
  • 153
7

Although I wouldn't really call it profiling, there is another option to analyze your Cython code by running cython with -a (annotate), this creates a webpage in which the main bottlenecks are highlighted. For example, when I forget to declare some variables:

enter image description here

After correctly declaring them (cdef double dudz, dvdz):

enter image description here

Bart
  • 9,825
  • 5
  • 47
  • 73
  • 9
    True, not typing your variables will slow your code down. But `-a` will not give you any information about actual runtime but only whether you are making `python` calls. – Till Hoffmann Nov 20 '15 at 17:24
  • But in my case, things like forgetting to declare a variable in porting Python to Cython code is what typically makes the code slow, and its a quick and simple way to test for those things. That's why I called it "_not_ _really_ _profiling_"; it's just a simple first code check/analysis. – Bart Nov 20 '15 at 18:49