
I am following the post How to Calculate Execution Time of a Code Snippet in C++, which gives a nice solution for measuring the execution time of a code snippet. However, when I use that solution to measure the execution time of my code snippet on Linux, I found that every time I run the program, the reported execution time is different. So my question is: how can I get an objective evaluation of the execution time? An objective evaluation is important to me because I use the following scheme to evaluate different implementations of the same task:

    #include <iostream>

    // GetTimeMs64() and int64 come from the linked post.
    int main()
    {
        int64 begin, end;
        begin = GetTimeMs64();
        execute_my_codes_method1();
        end = GetTimeMs64();
        std::cout << "Execution time is " << end - begin << std::endl;
        return 0;
    }

First, I run the above code to get the execution time of the first method. After that, I change the code to call execute_my_codes_method2() instead and get the execution time of the second method.

    #include <iostream>

    int main()
    {
        int64 begin, end;
        begin = GetTimeMs64();
        execute_my_codes_method2(); // instead of execute_my_codes_method1()
        end = GetTimeMs64();
        std::cout << "Execution time is " << end - begin << std::endl;
        return 0;
    }

By comparing the two execution times I expect to compare the efficiency of the two implementations.

The reason I change the code and run the implementations separately is that it is very difficult to call them sequentially in one program. Therefore, if running even the same program at different times yields different execution times, comparing different implementations by their measured execution times seems meaningless. Any suggestions on this problem? Thanks.

feelfree
  • Make many calls to measure and do statistics to get more reliable and comparable results. Just using single calls to test may depend on so many factors that are completely unrelated to your program's code. – πάντα ῥεῖ Jul 04 '14 at 15:49
  • On top of this, code run in a toy environment may perform differently than code run in the real environment. As an example, in the real environment the code may be run more than once, the data it accesses may be in a cache or not, tables or code it uses may experience different cache misses, the branch prediction cache may be exhausted differently by different code, other threads, etc. Perfection is not attainable, all you can do is strive towards it. – Yakk - Adam Nevraumont Jul 04 '14 at 15:51
  • @πάνταῥεῖ Thanks for your comments. I can use valgrind/callgrind to profile the program in order to find its bottleneck. If I understand valgrind/callgrind correctly, it is hard to compare two methods if they are not in the same program. – feelfree Jul 04 '14 at 15:56
  • @feelfree Can't you isolate those function calls into unit test cases (which would even let you do the repetition automatically) run by a unit tester? What's the real problem that prevents both of these methods from being run in the same executable? – πάντα ῥεῖ Jul 04 '14 at 16:00
  • @πάνταῥεῖ Thanks, I double-checked the code, and it is possible to run both in the same executable with some additional work. I just want to quickly find the best implementation. Considering that running valgrind/callgrind may take some time, I was wondering about just using the execution time to compare different implementations: I can quickly select the one that gives me the least execution time. – feelfree Jul 04 '14 at 16:11
  • @πάνταῥεῖ What would be another solution if I do not run both methods in the same program? One solution I am thinking of is that if two implementations share the same function (for example, a data-reading function), I can run valgrind/callgrind and obtain the time of each implementation relative to that common function. By comparing the relative times of the different methods, I can figure out which implementation is the best. – feelfree Jul 04 '14 at 16:16
  • @feelfree IMHO it's still best to have unit tests up front. That not only lets you make such decisions quickly, but also proves that the original functionality isn't broken by the alternate implementation. You should do unit testing from the first line of code written in a system; introducing it later is a PITA and, in my experience, never gets done. Besides valgrind there are other methods for profiling, [gprof](http://www.thegeekstuff.com/2012/08/gprof-tutorial/) for example. Maybe this gives you better comparable results than valgrind, but I can't really tell, I've rarely used it. – πάντα ῥεῖ Jul 04 '14 at 16:35

1 Answer


Measuring a single call's execution time is pretty useless for judging any performance improvement. There are too many factors that influence the actual execution time of a function. If you measure timing, you should make many calls to the function, measure the time, and build a statistical average of the measured execution times:

    #include <iostream>

    int main() {
        int64 begin = 0, end = 0;
        begin = GetTimeMs64();
        // Call the function many times and average the total elapsed time.
        for (int i = 0; i < 10000; ++i) {
            execute_my_codes_method1();
        }
        end = GetTimeMs64();
        std::cout << "Average execution time is " << (end - begin) / 10000 << std::endl;
        return 0;
    }
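
If you prefer not to depend on GetTimeMs64() from the linked post, here is a minimal sketch that collects per-call samples with std::chrono::steady_clock and reports the minimum, median and mean. The body of execute_my_codes_method1() below is just a hypothetical stand-in for your real code; the minimum and median are usually less sensitive to scheduler noise than a single measurement:

    #include <algorithm>
    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <vector>

    // Hypothetical stand-in for the real implementation under test.
    void execute_my_codes_method1() {
        volatile double x = 0.0;
        for (int i = 0; i < 1000; ++i) x = x + i * 0.5;
    }

    int main() {
        using clock = std::chrono::steady_clock;
        const int runs = 10000;
        std::vector<std::int64_t> samples;
        samples.reserve(runs);

        // Time each call individually and store the sample in nanoseconds.
        for (int i = 0; i < runs; ++i) {
            auto start = clock::now();
            execute_my_codes_method1();
            auto stop = clock::now();
            samples.push_back(
                std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count());
        }

        std::sort(samples.begin(), samples.end());
        std::int64_t total = 0;
        for (std::int64_t s : samples) total += s;

        std::cout << "min:    " << samples.front() << " ns\n"
                  << "median: " << samples[samples.size() / 2] << " ns\n"
                  << "mean:   " << total / runs << " ns\n";
        return 0;
    }

steady_clock is monotonic, so adjustments to the system clock during the measurement don't distort the samples.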

Additionally, instead of what's shown above, having unit tests for your functions up front (using a decent testing framework like e.g. Google Test) will make such quick judgments as you mention a lot quicker and easier.

Not only can you determine how often the test cases should be run (to gather the statistical data for the average-time calculation), the unit tests can also prove that the desired/existing functionality and input/output consistency weren't broken by an alternate implementation.

As an extra benefit (as you mentioned difficulties running the two functions in question sequentially), most of these unit-test frameworks provide SetUp() and TearDown() methods that are executed before/after each test case. Thus you can easily provide a consistent state of preconditions or invariants for every single test-case run.
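
For illustration, here is a minimal sketch of such a fixture, assuming Google Test is available and using hypothetical stand-ins for execute_my_codes_method1()/execute_my_codes_method2(); SetUp()/TearDown() give every timing run the same starting conditions, and both methods live in one executable:

    #include <chrono>
    #include <iostream>
    #include <gtest/gtest.h>

    // Hypothetical stand-ins for the two implementations being compared.
    void execute_my_codes_method1() { volatile int x = 0; for (int i = 0; i < 1000; ++i)  x = x + i; }
    void execute_my_codes_method2() { volatile int x = 0; for (int i = 0; i < 1000; i += 2) x = x + i; }

    class MethodTimingTest : public ::testing::Test {
    protected:
        void SetUp() override {
            // Establish identical preconditions before every test
            // (load input data, warm caches, ...).
        }

        void TearDown() override {
            // Release resources so the next test starts from a clean state.
        }

        // Run f 'runs' times and return the average duration in microseconds.
        template <typename Func>
        static long long AverageMicros(Func f, int runs) {
            auto start = std::chrono::steady_clock::now();
            for (int i = 0; i < runs; ++i) f();
            auto stop = std::chrono::steady_clock::now();
            return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count() / runs;
        }
    };

    TEST_F(MethodTimingTest, Method1Timing) {
        std::cout << "method1 average: "
                  << AverageMicros(execute_my_codes_method1, 10000) << " us\n";
    }

    TEST_F(MethodTimingTest, Method2Timing) {
        std::cout << "method2 average: "
                  << AverageMicros(execute_my_codes_method2, 10000) << " us\n";
    }

    int main(int argc, char** argv) {
        ::testing::InitGoogleTest(&argc, argv);
        return RUN_ALL_TESTS();
    }

Assertions checking that both methods produce the same output could go into the same fixture, which also covers the correctness concern mentioned above.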


As a further option, instead of measuring to gather the statistical data yourself, you can use profiling tools that work via code instrumentation. A good example is GCC's gprof. I think it gathers information about how often each underlying function was called and how much time its execution took. This data can be analyzed later with the tool to find potential bottlenecks in your implementations.

Additionally, if you decide to provide unit tests in the future, you may want to ensure that all of your code paths regarding various input-data situations are well covered by your test cases. A very good example of how to do this is GCC's gcov instrumentation. To analyze the gathered coverage information you can use lcov, which visualizes the results quite nicely and comprehensively.

πάντα ῥεῖ