
I have two different C/C++ functions that do the same thing. I think one may be a lot faster, and I want to find out how much faster, so I call each one in a loop a jillion times and time the loop.

Further, I think the functions might be so fast that the overhead of the loop itself accounts for much of the measured time, so I want to subtract out the cost of an empty loop.

And yet, code such as the following quite often reports the empty loop taking more time than a loop that calls a function, despite each loop running 100 million times. I've taken to timing the empty loop twice and using the smaller figure, but it's still sometimes slower.

Example code comparing two functions with two empty loops:

  tp1 = chrono::high_resolution_clock::now();

  for ( i = 0; i < iRepeats; i++ )
      ;

  tp2 = chrono::high_resolution_clock::now();

  for ( i = 0; i < iRepeats; i++ )
       iLen = Function1();

  tp3 = chrono::high_resolution_clock::now();

  for ( i = 0; i < iRepeats; i++ )
       iLen = Function2();

  tp4 = chrono::high_resolution_clock::now();

  for ( i = 0; i < iRepeats; i++ )
      ;

  tp5 = chrono::high_resolution_clock::now();

  double dDurEmpty1 = chrono::duration<double>( tp2 - tp1 ).count();
  double dDurEmpty2 = chrono::duration<double>( tp5 - tp4 ).count();
  double dDurEmpty  = AKMin( dDurEmpty1, dDurEmpty2 );
  double dDurA = chrono::duration<double>( tp3 - tp2 ).count();
  double dDurB = chrono::duration<double>( tp4 - tp3 ).count();

  printf( "Length:          %7.4fx faster\n",
            ( dDurA - dDurEmpty ) / ( dDurB - dDurEmpty )   );

I'm not wedded to this general approach at all, so suggestions for a different way to do it are welcome.

Swiss Frank
  • Why not just use a benchmarking tool? – Stephen Newell Apr 25 '20 at 20:13
  • Some IDEs have profilers built into them -- they can give you details on function times (based on sampling), and even which parts of functions. Without this, your method is fine, but generally speaking, for comparisons, you should use optimizations, in which case your first and last loops will be 0 since they'll be optimized right out. Even your function calls may be optimized away ... – ChrisMM Apr 25 '20 at 20:14
  • The unit tests for my modules all report performance etc., and run on multiple platforms (Red Hat and Windows, at a minimum, and several versions thereof). I'm not using an IDE but while profilers are available on Linux, to incorporate them into the unit test suite would be a pretty big project, and even then they wouldn't be portable. It's educational to see how numbers compare across platforms for instance. – Swiss Frank Apr 25 '20 at 20:40
  • Empty loop slower than with a function call: that sounds like you once again tested with optimization disabled, which is *not* educational or informative. Seriously, stop doing that. It doesn't reflect anything that will happen when you compile with normal optimization levels, because disabling optimization creates *different* bottlenecks. e.g. [Loop with function call faster than an empty loop](https://stackoverflow.com/q/45442458) – Peter Cordes Apr 25 '20 at 21:51
  • If doing it twice matters, that's because the first time was before the CPU jumped up to max turbo clock speed. Failure to warm up the CPU is a common benchmark pitfall on modern systems that aggressively save power when idle. – Peter Cordes Apr 25 '20 at 21:53
