
I had an interesting discussion with my friend about benchmarking C/C++ code (or code in general). We wrote a simple function which uses getrusage to measure the CPU time a given piece of code consumes (i.e. how much CPU time it took to run a specific function). Let me give you an example:

const int iterations = 409600;
double s = measureCPU();
for (int j = 0; j < iterations; j++)
    function(args);
double e = measureCPU();
std::cout << (e-s)/iterations << " s \n";
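
A minimal sketch of such a getrusage-based helper (our actual measureCPU isn't shown here; this version returns the user CPU time consumed so far, in seconds):

    #include <sys/resource.h>

    // Sketch of a getrusage-based timer: returns the user CPU time
    // consumed by the process so far, in seconds.
    double measureCPU()
    {
        struct rusage usage;
        getrusage(RUSAGE_SELF, &usage);
        return usage.ru_utime.tv_sec + usage.ru_utime.tv_usec / 1e6;
    }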

We argued about whether we should divide (e-s) by the number of iterations or not. When we don't divide, the result comes out in a readable form (e.g. 3.0 s), but when we do divide, we get results like 2.34385e-07 s ...

So here are my questions:

  1. should we divide (e-s) by the number of iterations, and if so, why?
  2. how can we print 2.34385e-07 s in a more human-readable form (say, 0.00000003 s)?
  3. should we first call the function once, unmeasured, and only then measure the CPU time of the iterations, something like this:

    // first function call, we don't care about its timing at all
    function(args);
    // real benchmarking
    const int iterations = 409600;
    double s = measureCPU();
    for (int j = 0; j < iterations; j++)
        function(args);
    double e = measureCPU();
    std::cout << (e-s)/iterations << " s \n";
    
stijn
mazix
  • Somehow I think `std::cout` is not C but C++. – Daniel Fischer Jul 15 '13 at 11:38
  • Why on earth would you think that `0.00000003` is readable? – Christoph Jul 15 '13 at 11:40
  • @DanielFischer: sorry, added a `c++` tag also ;) – mazix Jul 15 '13 at 11:41
  • When I studied algorithms it was noted not to rely on computation speed, as that varies from machine to machine, but instead to focus on the order of growth by determining the run-time complexity. – null Jul 15 '13 at 11:42
  • one way is to multiply the divided number by 10^6 to get a unit of microseconds per iteration – nio Jul 15 '13 at 11:42
  • With an execution time of less than 500ns per function call you might measure a lot of overhead from the loop etc. If you really need a performance count for "function()" you should consider using a profiler instead of manual measurement. – ogni42 Jul 15 '13 at 11:48
  • 1
    IMO The question is a bit strange. Dividing by the loop count is a different question then measuring the exact runtime. It depends what must be measured. If you want to profile code for bottle neck, the average runtime should give a goood enough indication, but the number of function calls is more important. If you have a specific timing requirement you must measure the individual runtime and ensure that all branches are not exceeding your requirement. So the question is, what do you want to measure. – Devolus Jul 15 '13 at 11:52
  • @Devolus: thanks! I want to measure how much CPU time it takes to run my `function`, that's all (it does some math/computation, btw). – mazix Jul 15 '13 at 11:55
  • @SteveGreen: actual runtime is pretty relevant in the real world. That's why things like profilers exist... – Oliver Charlesworth Jul 15 '13 at 12:30
  • 2
    I would consider making an initial call to `function(args)` that is not included in the iteration count & time interval, in order to discount the time it takes for the machine code to be loaded into cpu cache. Also, consider automatically calibrating number of iterations required, by performing initial loop, counting how many loops can be performed in 1 second. And, before the actual performance measurement phase, call `sched_yield` to give you a full timeslice. – Darren Smith Jul 15 '13 at 12:59
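
A rough illustration of the calibration idea from the last comment: a warm-up call, a short loop that doubles the iteration count until roughly one second of CPU time is spent, and a `sched_yield` before the timed run. The helper name, the doubling strategy, and the one-second target are illustrative, not from the original post; `function`, `args` and `measureCPU` are assumed to be in scope as in the question.

    #include <sched.h>   // sched_yield (POSIX)

    // Hypothetical calibration sketch: find an iteration count that runs
    // for roughly one second of CPU time.
    int calibrateIterations()
    {
        function(args);                  // warm-up: load the code into cache
        int n = 1;
        for (;;) {
            double s = measureCPU();
            for (int j = 0; j < n; j++)
                function(args);
            double e = measureCPU();
            if (e - s >= 1.0)            // ~1 second of CPU time reached
                return n;
            n *= 2;                      // otherwise try twice as many calls
        }
    }

    // ...later, before the real measurement:
    // int iterations = calibrateIterations();
    // sched_yield();                    // start the timed loop on a fresh timeslice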

1 Answer

  1. if you divide the time by the number of iterations, you'll get an iteration-independent comparison of the run time of one function; the more iterations, the more precise the result. EDIT: it's the average run time over n iterations.
  2. you can multiply the divided time by 1e6 to get microseconds per iteration (I assume that measureCPU returns seconds); note that the unit printed should then be microseconds, not seconds:

    std::cout << 1e6*(e-s)/iterations << " us \n";
    
  3. as @ogni42 stated, the for loop adds overhead to your measured time, so you could try to unroll the loop a bit to lower the measurement error: do 8 to 16 calls each iteration and try different call counts to see how the measured time changes (a combined sketch follows at the end of this answer):

    for (int j = 0; j < iterations; j++) {
        // each pass now makes several calls, so divide the measured time
        // by iterations times the number of calls per pass
        function(args);
        function(args);
        function(args);
        function(args);
        ...
    }
    
  4. What you basically get is a lower-is-better number. If you wanted a higher-is-better score, you could measure different variations of the function and then take the time of the fastest one, which could score, say, 10 points:

    score_for_actual_function = 10.0 * fastest_time / time_of_actual_function
    

This scoring is kind of time-independent, so you can directly compare different function variations, and a function can score less than one point... and beware of division by zero :)
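
Putting points 1–3 together, a sketch of the whole measurement could look like this (the unroll factor of 4 and the fixed-point formatting are just one possible choice, and `function`, `args` and `measureCPU` are the ones from the question):

    #include <iomanip>
    #include <iostream>

    const int iterations = 409600;
    const int unroll = 4;                    // calls per loop pass

    function(args);                          // warm-up call, not measured

    double s = measureCPU();
    for (int j = 0; j < iterations; j++) {   // unrolled: loop overhead is paid
        function(args);                      // once per 'unroll' calls
        function(args);
        function(args);
        function(args);
    }
    double e = measureCPU();

    // total calls = iterations * unroll, so divide by both; multiply by 1e6
    // and print with fixed precision for a human-readable result
    std::cout << std::fixed << std::setprecision(3)
              << 1e6 * (e - s) / (iterations * unroll) << " us per call\n";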

nio
  • About the 1st: so is it the best way to measure a piece of code? – mazix Jul 15 '13 at 12:02
  • 1
    it's still not machine independent... you would have to try your code on different machines and make an average of those averaged run times of each function variant – nio Jul 15 '13 at 12:06
  • 1
    ad 1: No, what you get is the the total time for the function plus the overhead of looping. With such a small amount of time spent per function call, the overhead of the loop is most likely not negligible. – ogni42 Jul 15 '13 at 13:02
  • so should he make his for loop flatter? e.g. calling the function 16 times during one iteration – nio Jul 15 '13 at 13:21