9

I've started on my first somewhat serious programming project with PHP and am having a hard time getting good profiling information. I have used xdebug and cachegrind to profile the code, but it assumes that a function's "self" execution time is the total time the function takes to execute minus the aggregated time of any function(s) it calls.

While this is correct in essence, it does not help as much as it could when I'm trying to optimize the code. For example, I have a simple function with two if statements which returns an array with two values and it takes 26% of the total execution time according to xdebug. It uses array_key_exists, and I can see that the execution time of those function calls is around 1%. What I cannot see is how other 25% of the execution time is accounted for with the rest of the code. How long did it take for the if statements, getting the pair of values, putting those values into an array, and returning the array?

Is there some xdebug flag which I missed in the documentation? Or is there a profiling tool for PHP which will profile variable assignments, operators, the construction of the basic data types, and other non-functional calls? Preferably, something that does not assume that the script is available via a web server.

An example of something along the lines of what I'm looking for is nytprof for perl. Note that it profiles every single line of code within the function (besides the last push, which never gets executed).

Ogo
  • 99
  • 2

2 Answers2

4

If you want to measure per line statistics I'd suggest using a combination of declare (for tick), register_tick_function, microtime() and debug_backtrace. This lets you call arbitrary functions after every line of execution w/o altering the PHP source.

For example:

declare(ticks=1);
register_tick_function( function() {
    print "call\n";
} );

print "hi\n";
print "bye\n";

Actually runs like this:

declare(ticks=1);
register_tick_function( function() {
    //print "call\n";
} ); print "call\n";

print "hi\n"; print "call\n";
print "bye\n"; print "call\n";

Unfortunately your on your own if you go this route. You'll have to log the data yourself, analyze and then figure out an way of correlating it with the source. It's not impossible but as far as I know these tools don't already exist for PHP.

Proof-of-concept:

function _line_benchmark( $reset = FALSE ) {
    $start_time = microtime(TRUE);
    static $lastFrame = NULL;

    if( $reset ) {
        $lastFrame = NULL;
        return;
    }

    $backtrace = debug_backtrace();
    $last_backtrace = $backtrace[count($backtrace) -1];
    $file = $last_backtrace["file"];
    $line = $last_backtrace["line"];
    $source = file_get_contents( $file );
    $source_lines = preg_split( "/((\r(?!\n))|((?<!\r)\n)|(\r\n))/", $source );

    if( ! is_null( $lastFrame ) ) {
        if( $lastFrame["line"]+1 <= $line ) {
            print "{\n";
            for( $i = $lastFrame["line"]+1; $i <= $line; $i++ ) {
                print "#".($i-1)."\t".$source_lines[$i-1]."\n";
            }
            print "} - ".($start_time-$lastFrame["time"])."\n\n";
        }
    }

    $lastFrame = array(
        "time" => microtime(TRUE),
        "file" => $file,
        "line" => $line
    );
}

function line_benchmark_start() {
    _line_benchmark( TRUE ); //reset benchmark functions

    declare(ticks=1);
    register_tick_function( "_line_benchmark" );
}

function line_benchmark_stop() {
    unregister_tick_function( "_line_benchmark" );
}


line_benchmark_start();

usleep( 100 );
usleep( 1000 );
usleep( 10000 );
usleep( 100000 );
usleep( 1000000 );

line_benchmark_stop();

Output

{
#48     
#49     usleep( 100 );
} - 0.000154972076416

{
#50     usleep( 1000 );
} - 0.00112199783325

{
#51     usleep( 10000 );
} - 0.0101318359375

{
#52     usleep( 100000 );
} - 0.0998418331146

{
#53     usleep( 1000000 );
} - 0.999831914902
Kendall Hopkins
  • 43,213
  • 17
  • 66
  • 89
3

When you're running the program under xdebug, you should be able to break into it's execution manually at a random time, and then display the call stack. If you do that several times, any code that's responsible for significant time will appear on those stack samples. That's the random-pause technique.

It is a bit of a brain-shift. Things like self-time, measuring function execution time, or counting calls, for the purpose of finding problems, are indirect information at best. Rather, it says anything that's actually costing much time can't escape your attention if you just drop in on it and take a look. Anything you see it "doing" (and that means any line of code on the stack that you catch more than once) is something that will give a good speedup if you can avoid doing it. Precision of measurement is irrelevant. You've found it, so you can fix it, and afterwards measure the speedup.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
  • After looking over your suggestion a few times and trying to wrap my mind around it (your correct that it is a brain-shift), I think the random-pause technique is quite similar to running cachegrind to profile the application. – Ogo Mar 26 '11 at 18:00
  • Too slow to edit: Regardless of the tool or technique, the goal is to identify what operations are taking time which I think both methods do fairly well. What I was looking for was a way to make the profiler give more granular information to better define the problem. Having this information using a profiler would make optimizing the code easier (no instrumentation). It seems that random-pause can do that, but appears to be more labor intensive as well. – Ogo Mar 26 '11 at 18:14
  • 1
    @Ogo: You might be surprised. You have a function F that xdebug says takes 26% of time, but what it calls G only takes 1%, so you're puzzled. Ok, then 10 samples should show F on 2.6 of them, and they will also show the statement inside F where it is during that time. So I know it sounds more like debugging than running a tool and puzzling over the output, but you'll still have the answer before the other tool is even installed and had its manual read. [Here's another take on it.](http://stackoverflow.com/questions/4832642/when-is-optimization-premature/4832698#4832698) – Mike Dunlavey Mar 26 '11 at 19:36