I noticed that code on my computer runs significantly slower if I call std::this_thread::sleep_for before executing it. I wrote a small Google Benchmark for demonstration: a simple for loop that is executed 100 million times.
BENCHMARK CODE WITHOUT SLEEP_FOR:
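#include <benchmark/benchmark.h>

// Simple loop that serves as the workload.
int test(int iterations) {
    int a;
    for (int i = 0; i < iterations; i++) {
        a = i * 200 + 100;
    }
    return a;
}

// Each benchmark iteration runs the loop 100 million times.
void runBenchmark(benchmark::State &state) {
    for (auto _ : state) {
        test(100000000);
    }
}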
BENCHMARK RESULTS WITHOUT SLEEP_FOR:
Benchmark                       Time        CPU   Iterations
../1/process_time             145 ms     145 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             145 ms     145 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time             144 ms     144 ms            1
../1/process_time_mean        144 ms     144 ms           10
../1/process_time_median      144 ms     144 ms           10
../1/process_time_stddev    0.419 ms   0.417 ms           10
../1/process_time_min         144 ms     144 ms           10
The second time I let the thread sleep for 100 ms before executing the loop. As expected, the wall time is 100 ms higher. But in addition, the CPU time also increases by about 30 ms (from a mean of 144 ms to 173 ms):
CODE WITH SLEEP_FOR:
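#include <benchmark/benchmark.h>
#include <chrono>
#include <thread>

using namespace std::chrono_literals;  // enables the 100ms literal

// Same workload as before.
int test(int iterations) {
    int a;
    for (int i = 0; i < iterations; i++) {
        a = i * 200 + 100;
    }
    return a;
}

// Identical benchmark, except that the thread sleeps before each run of the loop.
void runBenchmark(benchmark::State &state) {
    for (auto _ : state) {
        std::this_thread::sleep_for(100ms);
        test(100000000);
    }
}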
BENCHMARK RESULTS WITH SLEEP_FOR:
Benchmark                       Time        CPU   Iterations
../1/process_time             275 ms     175 ms            1
../1/process_time             273 ms     173 ms            1
../1/process_time             272 ms     172 ms            1
../1/process_time             276 ms     176 ms            1
../1/process_time             274 ms     173 ms            1
../1/process_time             271 ms     171 ms            1
../1/process_time             271 ms     171 ms            1
../1/process_time             273 ms     173 ms            1
../1/process_time             276 ms     176 ms            1
../1/process_time             274 ms     173 ms            1
../1/process_time_mean        274 ms     173 ms           10
../1/process_time_median      274 ms     173 ms           10
../1/process_time_stddev     1.75 ms    1.75 ms           10
../1/process_time_min         275 ms     175 ms           10
What might be the explanation for that? When I first encountered this behaviour in another program I am currently working on, I suspected some kind of scheduling issue. But since the benchmark clearly shows that the CPU time itself increases, I really have no idea what is going on.
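To rule out Google Benchmark itself, the same measurement can be sketched by hand. This is only a minimal sketch, not the code I benchmarked above: busyLoop, processCpuMs, Timing and timeLoop are hypothetical names, and it assumes Linux, where std::clock reports the CPU time of the process. The loop uses unsigned arithmetic and a volatile variable so that the compiler cannot remove it and the multiplication cannot overflow a signed int.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical stand-in for the benchmarked loop. Unsigned arithmetic wraps
// around in a well-defined way, and the volatile sink keeps -O2 from
// optimizing the loop away.
unsigned busyLoop(unsigned iterations) {
    volatile unsigned sink = 0;
    for (unsigned i = 0; i < iterations; i++) {
        sink = i * 200u + 100u;
    }
    return sink;
}

// CPU time in milliseconds consumed by the process so far
// (assumption: std::clock measures process CPU time, as it does on Linux).
double processCpuMs() {
    return 1000.0 * static_cast<double>(std::clock()) / CLOCKS_PER_SEC;
}

struct Timing {
    double wallMs;  // wall-clock time spent in the loop only
    double cpuMs;   // process CPU time spent in the loop only
};

// Optionally sleeps first, then times only the loop, so the sleep itself
// is excluded from both measurements.
Timing timeLoop(unsigned iterations, std::chrono::milliseconds sleepBefore) {
    if (sleepBefore.count() > 0) {
        std::this_thread::sleep_for(sleepBefore);
    }
    const auto wallStart = std::chrono::steady_clock::now();
    const double cpuStart = processCpuMs();
    busyLoop(iterations);
    const double cpuEnd = processCpuMs();
    const auto wallEnd = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> wall = wallEnd - wallStart;
    return {wall.count(), cpuEnd - cpuStart};
}
```

Calling timeLoop(100000000u, std::chrono::milliseconds(0)) and timeLoop(100000000u, std::chrono::milliseconds(100)) a few times each from main and printing both fields should show whether the loop alone gets slower after the sleep, independent of how the benchmark library accounts for time.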
The code is running on an Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz.
EDIT: Here is an additional image from my original program. It shows the execution of 3 neural networks. sleep_for is only called before running the first network. The image shows that especially the first layers of the first network run extremely slowly in comparison.
EDIT: The code was compiled with llvm-8 and the -O2 optimization flag on an Ubuntu 18.04 machine.
Kind regards, lpolari