
I have the following Cython code:

# cython: profile=True
import cProfile
from cython import parallel
from libc.stdio cimport FILE, fopen, fclose, fwrite, getline, printf
from libc.stdlib cimport free
from libc.string cimport strlen
from libcpp.string cimport string
cdef extern from "stdio.h" nogil:
    int mkstemp(char*)

cdef run_io(string obj):
    cdef int i, dump
    cdef size_t len_ = 0
    cdef char* fname = b"/tmp/tmpc_XXXXXX"
    cdef char* nullchar = NULL
    cdef char* line = NULL
    cdef string content = b""
    cdef FILE* cfile
    for i in range(10000):
        dump = mkstemp(fname)
        cfile = fopen(fname, b"wb")
        fwrite(obj.data(), 1, obj.size(), cfile)
        fclose(cfile)
        cfile = fopen(fname, b"rb")
        while True:
            if getline(&line, &len_, cfile) == -1:
                break
            else:
                content.append(line)
        fclose(cfile)
    free(line)  # release the buffer allocated by getline()

def run_test():
    cdef string obj = b"abc\ndef"
    cProfile.runctx("run_io(obj)", globals(), locals())
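
For reference, I compile the module roughly like this (a minimal setup.py sketch; the file name testc2.pyx is the one that appears in the profiler output below):

# setup.py -- minimal build sketch; assumes the code above is saved as testc2.pyx
from setuptools import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("testc2.pyx"))

and then build it with python3 setup.py build_ext --inplace before importing it in the console.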

When I try to run it from a python3 console, I get the error:

NameError: name 'run_io' is not defined

If I change cdef to def in the definition of run_io, it works:

         7 function calls in 2.400 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.400    2.400 <string>:1(<module>)
        2    0.000    0.000    0.000    0.000 stringsource:13(__pyx_convert_string_from_py_std__in_string)
        1    2.400    2.400    2.400    2.400 testc2.pyx:10(run_io)
        1    0.000    0.000    2.400    2.400 {built-in method exec}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    2.400    2.400 {test.run_io}

However, this is not very informative: I see only the total runtime of the whole function, whereas I would like to see the partial runtimes for generating the filename, writing, reading, and so on.

Therefore, I have two questions:

  1. Is it possible to profile Cython functions (defined with cdef)? If yes, how?

  2. How can I make the profiling more informative (i.e. measure the time spent in each called function)?


1 Answer


Python and Cython profiling are deterministic, meaning they work by recording the wall-clock time at the entry and exit of functions, but only for the functions the profiler has been told to profile. They cannot give you line-by-line timing information unless they also record the time before and/or after every line of code.
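
To make that concrete with a plain-Python sketch (the function names here are invented for illustration, not taken from the question): cProfile produces one row per function it sees enter and exit, so the separate phases inside a single function are lumped together unless you split them into functions of their own.

import cProfile

def phase_write():
    # Stand-in for the "write the file" part of the work.
    return sum(i * i for i in range(10**6))

def phase_read():
    # Stand-in for the "read it back" part of the work.
    return sum(range(10**3))

def run_io_like():
    # The profiler times run_io_like as a whole, plus one row each for
    # phase_write and phase_read, because those are calls it can see
    # enter and exit.  Work done inline here would not get its own row.
    return phase_write() + phase_read()

cProfile.run("run_io_like()")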

One way to get line-by-line information, if you don't mind giving up measurement precision (which isn't all it's cracked up to be anyway), is to take a small number of stack samples. Every line of code has an inclusive time (cumtime) that is some fraction of total time, and that is exactly the fraction of time it spends on the stack. So when you take a stack sample, that fraction is the probability that you will see the line in it. If you look at 10 or 20 samples, you get a good idea of which lines of code take a lot of time. Here's an example of the idea:
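
This is a minimal, Unix-only sketch in pure Python (the names are invented; it relies on SIGALRM and only sees Python-level frames, so time spent inside compiled cdef code gets attributed to its Python caller; a native-level sampler or a debugger such as gdb would be needed to look inside that):

import signal
import traceback

def dump_stack(signum, frame):
    # One "sample": print the Python stack as it is right now.  A line that
    # accounts for, say, 40% of the run time appears in roughly 40% of samples.
    print("---- stack sample ----")
    traceback.print_stack(frame)

def start_sampling(interval=0.5):
    # Ask the OS to interrupt the process every `interval` seconds of wall time.
    signal.signal(signal.SIGALRM, dump_stack)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)

def busy():
    # Stand-in workload so the sampler has something to catch.
    return sum(i * i for i in range(10**6))

if __name__ == "__main__":
    start_sampling()
    for _ in range(50):
        busy()

The lines that dominate the run time keep showing up across the printed samples.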

:) An objection I sometimes hear is "Won't that completely slow down the program and invalidate the results?" OK, think about it. There's a line of code, and it takes some fraction of the time, like 36.5%, so it is on the stack for that fraction of the time. Now you start the program, and 9 seconds later you interrupt it to look at the stack. There's a 36.5% chance that line is on the stack. Now the doorbell rings, and you don't get back to look at it until a week later. Does that week-long delay change the probability that the line is in the stack sample? Of course not. The computer is patient; the stack sample is unchanged no matter how long you take to look at it.
