Let's say I want to benchmark two competing implementations of some function double a(double b, double c)
. I already have a large array <double, 1000000> vals
from which I can take input values, so my benchmarking would look roughly like this:
//start timer here
double r;
for (int i = 0; i < 1000000; i+=2) {
r = a(vals[i], vals[i+1]);
}
//stop timer here
Now, a clever compiler could realize that I can only ever use the result of the last iteration and simply kill the rest, leaving me with double r = a(vals[999998], vals[999999])
. This of course defeats the purpose of benchmarking.
Is there a good way (bonus points if it works on multiple compilers) to prevent this kind of optimization while keeping all other optimizations in place?
(I have seen other threads about inserting empty asm
blocks but I'm worried that might prevent inlining or reordering. I'm also not particularly fond of the idea of adding the results sum += r;
during each iteration because that's extra work that should not be included in the resulting timings. For the purposes of this question, it would be great if we could focus on other alternative solutions, although for anyone interested in this there is a lively discussion in the comments where the consensus is that +=
is the most appropriate method in many cases. )