6

I'm trying to find a way to test how long it takes a block of C++ code to run. I'm using it to compare implementations with different algorithms and in different languages, so ideally I would like a time in seconds / milliseconds. In Java I'm using something like this:

long startTime = System.currentTimeMillis();

function();

long stopTime = System.currentTimeMillis();
long elapsedTime = stopTime - startTime; 

Is there a good way to get an accurate time like that in C++ (Or should I use some other means of benchmarking)?

Jeremy
  • 2,826
  • 7
  • 29
  • 25
  • 4
    Related question: http://stackoverflow.com/questions/275004/c-timer-function-to-provide-time-in-nano-seconds – Andy White Apr 28 '09 at 00:55
  • 1
    Timing is platform-dependent. You should list which platform(s) you're using. – David Thornley Jul 14 '10 at 15:17
  • It's frustrating that none of the answers here have a statistical component. – uckelman Sep 28 '11 at 14:06
  • I'm going to go ahead and answer my own question by saying that the link () in the comment posted by [Andy White](http://stackoverflow.com/users/60096/andy-white) was what I was looking for. – Jeremy Jul 14 '10 at 15:10

14 Answers

9

Use the best counter available on your platform, and fall back to time() for portability. I am using QueryPerformanceCounter, but see the comments in the other reply.

General advice:

The inner loop should run for at least about 20 times the resolution of your clock, so that the resolution error is < 5%. (So when using time(), your inner loop should run for at least 20 seconds.)

Repeat these measurements, to see if they are consistent.

I use an additional outer loop, running ten times, and ignoring the fastest and the slowest measurement when calculating average and deviation. Deviation comes in handy when comparing two implementations: if you have one algorithm taking 2.0 ms +/- 0.5, and the other 2.2 +/- 0.5, the difference is not significant enough to call one of them "faster". (Max and min should still be displayed.) So IMHO a valid performance measurement should look something like this:

10000 x 2.0 +/- 0.2 ms (min = 1.2, max = 12.6), 10 repetitions

If you know what you are doing, purging the cache and setting thread affinity can make your measurements much more robust.

However, this is not without pitfalls. The more "stable" the measurement is, the less realistic it is as well. Any implementation will vary strongly with time, depending on the state of the data and instruction caches. I'm lazy here, using the max= value to judge the first-run penalty; this might not be sufficient for some scenarios.
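A rough sketch of that outer-loop scheme (my own illustration, not peterchen's code; the dummy workload and the clock()-based timer are just placeholders for the real inner loop):

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <ctime>
#include <vector>

// Placeholder for the timed inner loop: returns its duration in milliseconds.
double run_inner_loop()
{
    std::clock_t start = std::clock();
    volatile double x = 0;
    for (long i = 0; i < 10000000L; ++i)   // dummy workload
        x += i;
    return 1000.0 * (std::clock() - start) / CLOCKS_PER_SEC;
}

int main()
{
    std::vector<double> samples;
    for (int i = 0; i < 10; ++i)           // outer loop: 10 repetitions
        samples.push_back(run_inner_loop());

    std::sort(samples.begin(), samples.end());
    double min = samples.front(), max = samples.back();

    // Ignore the fastest and slowest sample, then compute mean and deviation.
    double sum = 0, sumsq = 0;
    int n = 0;
    for (std::size_t i = 1; i + 1 < samples.size(); ++i, ++n) {
        sum += samples[i];
        sumsq += samples[i] * samples[i];
    }
    double mean = sum / n;
    double dev = std::sqrt(sumsq / n - mean * mean);

    std::printf("%.1f +/- %.1f ms (min = %.1f, max = %.1f), %d repetitions\n",
                mean, dev, min, max, (int)samples.size());
}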

peterchen
  • 40,917
  • 20
  • 104
  • 186
8

Execute the function a few thousand times to get an accurate measurement.

A single measurement might be dominated by OS events or other random noise.
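For example, a minimal sketch of that (my illustration, using std::chrono from C++11; function() stands in for the code under test, as in the question):

#include <chrono>
#include <iostream>

void function() { /* the code under test */ }

int main()
{
    const int iterations = 10000;              // run the function many times

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        function();
    auto stop = std::chrono::steady_clock::now();

    // Average time per call, in milliseconds
    auto total_ms = std::chrono::duration<double, std::milli>(stop - start).count();
    std::cout << "average: " << total_ms / iterations << " ms per call\n";
}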

S.Lott
  • 384,516
  • 81
  • 508
  • 779
5

In Windows, you can use high performance counters to get more accurate results:

You can use the QueryPerformanceFrequency() function to get the number of high-frequency ticks per second, and then use QueryPerformanceCounter() before and after the function you want to time.

Of course, this method is not portable...
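A minimal sketch of that approach (my illustration, not part of the original answer):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    LARGE_INTEGER frequency, start, stop;

    QueryPerformanceFrequency(&frequency);   // high-frequency ticks per second
    QueryPerformanceCounter(&start);

    /* ... code to benchmark ... */

    QueryPerformanceCounter(&stop);

    double elapsed_ms = (double)(stop.QuadPart - start.QuadPart) * 1000.0 / frequency.QuadPart;
    printf("elapsed: %.3f ms\n", elapsed_ms);
    return 0;
}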

MartinStettner
  • 28,719
  • 15
  • 79
  • 106
  • Be careful with the HF counters. Multi-processor systems sometimes make their use...interesting. They work with processor-specific counters, so if you end up with your code on a different CPU, the counters can be off (depending on your exact hardware, of course). – Michael Kohne Apr 28 '09 at 01:05
  • 1
    Setting thread affinity to a single CPU for the duration of the benchmark will eliminate SMP-related time warps. There also may be clock ramping due to power management, which would matter if the CPU becomes idle because the benchmarked code sleeps or blocks on I/O. For AMD systems, installing the AMD Processor Driver will improve QPC() synchronization considerably. Windows Vista and Windows 7 use the HPET timer (if available) instead of the TSC, so the TSC problems may eventually go away (when/if Windows XP goes away). – bk1e Apr 28 '09 at 02:44
5

Have you considered actually using a profiler? Visual Studio Team System has one built in, but there are others available like VTune and GlowCode.

See also What's the best free C++ profiler for Windows?

Community
  • 1
  • 1
rlbond
  • 65,341
  • 56
  • 178
  • 228
  • Gah! Do they have ones that don't cost so much? – Billy ONeal Apr 28 '09 at 01:47
  • 4
    Profilers are good for answering the question "what is the slowest part of my program?" but usually not so good at answering the question "how slow is my program?" accurately. – bk1e Apr 28 '09 at 02:50
4

What's wrong with clock() and CLOCKS_PER_SEC? They are standard C89.

Something like (nicked from MSDN):

#include <stdio.h>
#include <time.h>

int main(void)
{
   long i = 6000000L;
   clock_t start, finish;
   double  duration;

   // Measure the duration of an event.
   printf( "Time to do %ld empty loops is ", i );
   start = clock();
   while( i-- )
      ;
   finish = clock();
   duration = (double)(finish - start) / CLOCKS_PER_SEC;
   printf( "%2.1f seconds\n", duration );
   return 0;
}
PowerApp101
  • 1,798
  • 1
  • 18
  • 25
2

OVERVIEW

I have written a simple semantic hack for this.

  • Easy to use
  • Code looks neat.

MACRO

#include <time.h>

#ifndef SYSOUT_F
#define SYSOUT_F(f, ...)      _RPT1( 0, f, __VA_ARGS__ ) // For Visual Studio
#endif

#ifndef speedtest__             
#define speedtest__(data)   for (long blockTime = NULL; (blockTime == NULL ? (blockTime = clock()) != NULL : false); SYSOUT_F(data "%.9fs", (double) (clock() - blockTime) / CLOCKS_PER_SEC))
#endif

USAGE

speedtest__("Block Speed: ")
{
    // The code goes here
}

OUTPUT

Block Speed: 0.127000000s
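If you are not building with Visual Studio (so _RPT1 is unavailable), one possible fallback, just a sketch and not part of the original macro, is to define SYSOUT_F in terms of fprintf instead:

#include <stdio.h>

#ifndef SYSOUT_F
#define SYSOUT_F(f, ...)      fprintf(stderr, f, __VA_ARGS__) // Portable fallback
#endif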
Mathew Kurian
  • 5,949
  • 5
  • 46
  • 73
2

You can use the time() function to get a timer with a resolution of one second. If you need more resolution, you could use gettimeofday(). The resolution of that depends on your operating system and runtime library.
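For example, a minimal sketch of the gettimeofday() approach on a POSIX system (my illustration, not part of the original answer):

#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval start, stop;

    gettimeofday(&start, NULL);          // record the start time
    /* ... code to benchmark ... */
    gettimeofday(&stop, NULL);           // record the stop time

    // Convert the difference to milliseconds
    double elapsed_ms = (stop.tv_sec  - start.tv_sec)  * 1000.0 +
                        (stop.tv_usec - start.tv_usec) / 1000.0;
    printf("elapsed: %.3f ms\n", elapsed_ms);
    return 0;
}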

Greg Hewgill
  • 951,095
  • 183
  • 1,149
  • 1,285
2

I always use boost::timer or boost::progress_timer.

pseudo-code:

#include <boost/timer.hpp>
#include <iostream>

boost::timer t;

func1();
std::cout << "func1: " << t.elapsed() << "\n";

t.restart();
func2();
std::cout << "func2: " << t.elapsed() << "\n";
t.g.
  • 1,719
  • 2
  • 14
  • 25
2

If you want to check your performance, you should consider measuring the processor time used, not the wall-clock time you are measuring now. Otherwise you might get quite inaccurate times if some other application running in the background decides to do some heavy calculations at the same time. The functions you want are GetProcessTimes on Windows and getrusage on Linux.
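For instance, a minimal sketch of the getrusage() approach on Linux (my illustration, not part of the original answer):

#include <stdio.h>
#include <sys/resource.h>

// CPU time (user + system) consumed by this process so far, in seconds.
static double cpu_seconds(void)
{
    struct rusage usage;
    getrusage(RUSAGE_SELF, &usage);
    return (usage.ru_utime.tv_sec + usage.ru_stime.tv_sec) +
           (usage.ru_utime.tv_usec + usage.ru_stime.tv_usec) / 1e6;
}

int main(void)
{
    double before = cpu_seconds();
    /* ... code to benchmark ... */
    double after = cpu_seconds();
    printf("CPU time: %.6f s\n", after - before);
    return 0;
}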

Also you should consider using profilers, as other people suggested.

n0rd
  • 11,850
  • 5
  • 35
  • 56
0

Your platform for deployment can have a serious impact on your clock precision. If you are sampling inside a virtual machine, all bets are off. The system clock in a VM floats in relation to the physical clock and has to be occasionally resynched. It is almost a certainty that this will happen, given the mischievous nature of Mr. Murphy in the software universe.

James Pulley
  • 5,606
  • 1
  • 14
  • 14
0

You can simply use time() within your code to measure with an accuracy of seconds. Thorough benchmarks should run many iterations for accuracy, so seconds should be a large enough margin. If you are using Linux, you can use the time utility provided by the command line like so:

[john@awesome]$time ./loops.sh

real    0m3.026s
user    0m4.000s
sys     0m0.020s
John T
  • 23,735
  • 11
  • 56
  • 82
0

On Unix systems (Linux, Mac, etc.), you can use the time utility like so:

$ time ./my_app
fengshaun
  • 2,100
  • 1
  • 16
  • 25
0

If your function is very fast, a good practice is to time the function in a loop and then subtract the loop overhead.

Something like this:

// getTime() stands in for whichever timer you use
// (clock(), gettimeofday(), QueryPerformanceCounter(), ...).
// Note: an optimizing compiler may remove the empty loop entirely.
int i;
int limit=1000000;
int t0=getTime();
for(i=0; i < limit; ++i)     // empty loop: measures the loop overhead itself
   ;
int t1=getTime();
int loopoverhead = t1-t0;
t0=getTime();
for(i=0; i < limit; ++i)
    function();
t1=getTime();
// Average time per call, with the loop overhead subtracted
double tfunction = (1.0/limit)*((t1-t0)-loopoverhead);
jfklein
  • 877
  • 1
  • 11
  • 13
0

Run it 1000 times as 100 iterations * 10 iterations, where you unroll the inner loop to minimize overhead. Then seconds translate to milliseconds.
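A sketch of what that might look like (my illustration, with clock() as a stand-in timer and function() as the code under test):

// 100 timed iterations, with the inner "loop" unrolled 10 times
// so the loop-control overhead is negligible: 1000 calls in total.
clock_t start = clock();
for (int i = 0; i < 100; ++i)
{
    function(); function(); function(); function(); function();
    function(); function(); function(); function(); function();
}
clock_t stop = clock();

double total_seconds = (double)(stop - start) / CLOCKS_PER_SEC;
// With 1000 calls, the total in seconds is numerically the time per call in ms.
double ms_per_call = total_seconds;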

As others have pointed out, this is a good way to measure how long it takes.

However, if you also want to make it take less time, that is a different goal, and needs a different technique. My favorite is this.

Community
  • 1
  • 1
Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135