
I have a program and I want to measure its execution (wallclock) time for different input sizes.

In some similar questions I read that using clock_gettime in the source code wouldn't be reliable because of the CPU's branch predictor, register renaming, speculative execution, out-of-order execution, etc., and that sometimes even the optimizer can move the clock_gettime call somewhere other than where I put it.

But the questions I read were about measuring the time of a specific function. Would these problems still exist if I'm measuring the whole program (i.e. the main function)? I'm looking for relative measurements (how the execution time changes for different input sizes), not absolute values.

How would I get better results? Using timing functions in the code:

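struct timespec start, end;
clock_gettime(CLOCK_MONOTONIC, &start);  /* monotonic clock: unaffected by wall-clock adjustments */
do_stuff();
clock_gettime(CLOCK_MONOTONIC, &end);
/* elapsed wallclock time in seconds */
double execution_time = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;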

or with the time command in bash:

time ./program
devil0150

1 Answer


Measuring inside the program will give you a more accurate answer. Sure, in theory, the clock_gettime calls can in some cases get moved to where you don't expect them. In practice, that will not happen if you have only a function call in between. (If in doubt, check the resulting assembly code.)

Calling time in shell will include things you don't care about, like the time it takes to load your executable and get to the interesting point. On the other hand, if your do_stuff takes a few seconds, then it doesn't really matter.

I'd go with the following recommendation:

  • If you can isolate your function easily and make it take a few seconds (you can also loop it, but measure an empty loop for comparison as well), then either clock_gettime or time will do just fine.
  • If you cannot isolate it easily, but your function consistently takes hundreds of milliseconds, use clock_gettime.
  • If you cannot isolate it and you're optimising something tiny, have a look at "rdtsc timing for a measuring a function", which discusses measuring the cycles actually executed.
viraptor
  • What do you mean "isolate and make it take a few seconds"? Should I add a sleep call every iteration? – devil0150 May 30 '17 at 10:31
  • @devil0150 definitely not! :) I mean, loop it enough times that the execution time is much longer than random delays / context switches / filling the cache / ... If the runtime of the measured code normally varies by 10ms, or can get delayed by 10ms just because something decided to swap at the same time, you want your code to run for 1s or so, so that the error is insignificant. Alternatively, you can run it hundreds of times, measuring each run precisely, and select the lowest time. – viraptor May 31 '17 at 00:04