We were developing a piece of code that would check whenever the user should not be allowed to get into a sector within a time period, one of my colleagues created a function which in the code below is the isAllowed and contains several comparisons, I took a different approach which is the function isAllowed2 which uses the amount of seconds between time periods.
At first we had no doubt that his function would be faster, but that is not true when actually running the code and comparing the speed, even if the difference is something we can completely ignore, we would like to know why is it that the one that "should" be faster is in fact slower.
Considering the following code:
#include <iostream>
#include <chrono>
using namespace std;
using namespace std::chrono;
struct timing {
short hour;
short minute;
};
bool isAllowed(timing &from, timing &to, timing &actual) {
return !(((from.hour > to.hour && (actual.hour >= from.hour || actual.hour <= to.hour)) ||
(actual.hour >= from.hour && actual.hour <= to.hour)) &&
!(actual.minute > from.minute && actual.minute < to.minute));
}
long getSecs(short hour, short minutes) {
return (hour * 3600) + (minutes * 60);
}
bool isAllowed2(timing &from, timing &to, timing ¤t) {
long secsFrom = getSecs(from.hour, from.minute);
long secsTo = getSecs(to.hour, to.minute);
long secsCurrent = getSecs(current.hour, current.minute);
if (secsFrom > secsTo) secsTo += 24 * 60 * 60;
if (secsCurrent > secsFrom && secsCurrent < secsTo) {
return false;
}
return true;
}
int main() {
//debug messages
std::string okay = " - ok";
std::string error = " - er";
std::string receive = " - allowed";
std::string notReceive = " - denied";
//testing times
int const testDataCount = 5;
timing from[testDataCount] = {
{ 16, 30 },
{ 8, 30 },
{ 10, 30 },
{ 0, 30 },
{ 0, 0 }
};
timing to[testDataCount] = {
{ 8, 30 },
{ 20, 0 },
{ 20, 0 },
{ 6, 0 },
{ 7, 0 }
};
for (int i = 0; i < testDataCount; i++) {
std::cout << i + 1 << ": " << from[i].hour << ":" << from[i].minute << " to " << to[i].hour << ":"
<< to[i].minute << std::endl;
}
//test current times
timing current[5] = {
{ 12, 0 },
{ 23, 0 },
{ 17, 30 },
{ 15, 12 },
{ 0, 20 }
};
bool ergValues[][testDataCount] = {
{ true, false, false, true, true },
{ false, true, true, true, true },
{ false, false, false, true, true },
{ true, false, false, true, true },
{ false, true, true, true, false }
};
long totalNs1 = 0;
long totalNs2 = 0;
for (int i = 0; i < 4; i++) {
std::cout << std::endl << i + 1 << ". Test: " << current[i].hour << ":" << current[i].minute << std::endl;
for (int j = 0; j < testDataCount; j++) {
high_resolution_clock::time_point t1 = high_resolution_clock::now();
bool response = isAllowed(from[j], to[j], current[i]);
high_resolution_clock::time_point t2 = high_resolution_clock::now();
high_resolution_clock::time_point t3 = high_resolution_clock::now();
bool response2 = isAllowed2(from[j], to[j], current[i]);
high_resolution_clock::time_point t4 = high_resolution_clock::now();
long ns1 = duration_cast<std::chrono::nanoseconds>(t2 - t1).count();
totalNs1 += ns1;
long ns2 = duration_cast<std::chrono::nanoseconds>(t4 - t3).count();
totalNs2 += ns2;
std::cout << j + 1 << "\t\t:1:" << ns1 << "ns: " << response << (response == ergValues[i][j] ? okay : error) << "\t\t:2:" << ns2 << "ms: " << response2 << (response2 == ergValues[i][j] ? okay : error) << "\t\t"
<< (ergValues[i][j] ? receive : notReceive) << std::endl;
}
}
std::cout << "\r\ntotalNs1 = " << totalNs1 << "\r\ntotalNs2 = " << totalNs2 << "\r\n\r\n";
return 0;
}
The result would obviously always differ, but no matter what the totalNs2 would always be smaller than the totalNs1.
Ex:
totalNs1 = 38796
totalNs2 = 25913
I tested this on a AMD Phenom II X4 and an Intel i7-3770, both on Windows 10 and Debian 8, with quite similar results.
So finally the question is, why is the function isAllowed2 faster than the isAllowed?
Note: This is mostly a curiosity question, if someone thinks the title or the tags are not the most appropriate please let me know and I'l change them accordingly, please excuse any eventual grammar errors as English is not my native language.
EDIT
Meanwhile I've been researching further based on all the suggestions and comments, including this incredibly detailed answer, after understanding how inaccurate micro-benchmarking can be (a HUGE thanks to Baum mit Augen for linking this amazing talk, which helped a lot) I ended up using the Google microbenchmark library to get more "accurate" results, it seems that the isAllowed function is in fact faster (compiled without optimization) as shown in the output from the library.
Run on (8 X 2395 MHz CPU s)
-----------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------
BM_isAllowed/2/min_time:2.000 22 ns 22 ns 128000000
BM_isAllowed/4/min_time:2.000 22 ns 22 ns 137846154
BM_isAllowed/8/min_time:2.000 22 ns 22 ns 128000000
BM_isAllowed/16/min_time:2.000 22 ns 22 ns 128000000
BM_isAllowed/22/min_time:2.000 22 ns 22 ns 137846154
BM_isAllowed2/2/min_time:2.000 24 ns 24 ns 112000000
BM_isAllowed2/4/min_time:2.000 24 ns 24 ns 119466667
BM_isAllowed2/8/min_time:2.000 24 ns 24 ns 119466667
BM_isAllowed2/16/min_time:2.000 24 ns 24 ns 119466667
BM_isAllowed2/22/min_time:2.000 24 ns 24 ns 119466667
Note: As Martin Bonner pointed out, the isAllowed function seems to have a logic flaw, don't use it production code.
Below you can find the code I used to do this benchmark, please let me know if there are any flaws in it as I'm not familiar with the Google library.
Important, this code was compiled with Visual Studio 2015 and optimization should be disabled for the section that we want to test.
#include "benchmark/benchmark.h"
using namespace std;
using namespace benchmark;
#pragma optimize( "[optimization-list]", {on | off} )
volatile const long extraDay = 24 * 60 * 60;
#pragma optimize( "", off )
struct timing {
short hour;
short minute;
timing(short hour, short minute) : hour(hour), minute(minute) {}
};
static void BM_isAllowed(benchmark::State& state) {
while (state.KeepRunning())
{
timing from(state.range(0), state.range(0));
timing to(state.range(0), state.range(0));
timing current(state.range(0), state.range(0));
bool b = !(((from.hour > to.hour && (current.hour >= from.hour || current.hour <= to.hour)) ||
(current.hour >= from.hour && current.hour <= to.hour)) &&
!(current.minute > from.minute && current.minute < to.minute));
}
}
static void BM_isAllowed2(benchmark::State& state) {
while (state.KeepRunning())
{
timing from(state.range(0), state.range(0));
timing to(state.range(0), state.range(0));
timing current(state.range(0), state.range(0));
bool b;
long secsFrom = secsFrom = (from.hour * 3600) + (from.minute * 60);
long secsTo = (to.hour * 3600) + (to.minute * 60);
long secsCurrent = (current.hour * 3600) + (current.minute * 60);
if (secsFrom > secsTo)
secsTo += extraDay;
if (secsCurrent > secsFrom && secsCurrent < secsTo)
b = false;
else
b = true;
}
}
#pragma optimize( "", on )
BENCHMARK(BM_isAllowed)->RangeMultiplier(2)->Range(2, 22)->MinTime(2);
BENCHMARK(BM_isAllowed2)->RangeMultiplier(2)->Range(2, 22)->MinTime(2);
BENCHMARK_MAIN();