When compiling the benchmark code below with -O3, I was impressed by the difference it made in latency, so I began to wonder whether the compiler is "cheating" by removing code somehow. Is there a way to check for that? Is it safe to benchmark with -O3? Is it realistic to expect 15x gains in speed?
Results without -O3: Average: 239 nanos, Min: 230 nanos (9 million iterations)
Results with -O3: Average: 14 nanos, Min: 12 nanos (9 million iterations)
    int iterations = stoi(argv[1]);
    int load = stoi(argv[2]);
    long long x = 0;
    for (int i = 0; i < iterations; i++) {
        long start = get_nano_ts(); // START clock
        for (int j = 0; j < load; j++) {
            if (i % 4 == 0) {
                x += (i % 4) * (i % 8);
            } else {
                x -= (i % 16) * (i % 32);
            }
        }
        long end = get_nano_ts(); // STOP clock
        // (omitted for clarity)
    }
    cout << "My result: " << x << endl;
Note: I am using clock_gettime to measure:
    long get_nano_ts() {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1000000000 + ts.tv_nsec;
    }
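One check I can imagine, as a rough sketch only and not something I know to be the right approach: hide the loop input from the optimizer and see whether the -O3 timing changes. The helper name `opaque` and the function `run_load` are hypothetical names I made up for this illustration. The empty inline-asm statement is GCC/Clang specific and is assumed here to stop the compiler from hoisting or folding the arithmetic, since nothing in the inner loop above depends on `j`.

```cpp
#include <ctime>

// Hypothetical helper (not part of my benchmark): an empty inline-asm
// statement that tells GCC/Clang the value may be read and modified, so
// computations depending on it cannot be folded or hoisted out of the loop.
template <typename T>
inline void opaque(T& value) {
    asm volatile("" : "+r"(value) : : "memory");
}

// Same clock as above, with an explicit 64-bit multiplication.
long long get_nano_ts() {
    timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

// Same arithmetic as the inner loop above, but a copy of `i` is made
// opaque on every pass, so each iteration has to be computed again.
long long run_load(int i, int load) {
    long long x = 0;
    for (int j = 0; j < load; j++) {
        int k = i;
        opaque(k);  // the compiler must assume k may have changed
        if (k % 4 == 0) {
            x += (k % 4) * (k % 8);
        } else {
            x -= (k % 16) * (k % 32);
        }
    }
    return x;
}
```

If the timings with -O3 move much closer to the unoptimized ones when the inner loop is replaced by a call like `x += run_load(i, load);` between the two clock reads, that would suggest the original loop was being simplified away. Comparing the assembly from `g++ -O3 -S` for both versions would be another way to look.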