Which algorithm brings the best performance?

Question

I have a piece of code that is really dirty.

I want to optimize it a little bit. Does it makes any difference when I take one of the following structures or are they identical with the point of view to performance in c++ ?

for(unsigned int i = 1; i < entity.size(); ++i) begin
if
 if ... else ...
for end

for(unsigned int i = 1; i < entity.size(); ++i) begin
if
 if ... else ...
for end

for(unsigned int i = 1; i < entity.size(); ++i) begin
if
 if ... else ...
for end
....

or

for(unsigned int i = 1; i < entity.size(); ++i) begin
if
 if ... else ...
if
 if ... else ...
if
 if ... else ...
....
for end

Thanks in Advance!

Reading this will be useful: [Why is one loop so much slower than two loops](http://stackoverflow.com/questions/8547778/why-is-one-loop-so-much-slower-than-two-loops). — jogojapan, Jun 25 '13 at 03:50
This isn't C++ syntax. In fact, it's hard to imagine it being the syntax of any language ... what does `if if ... else ...` mean? — Jim Balter, Aug 10 '13 at 05:42

score 3 · Answer 1 · answered Jun 25 '13 at 02:37

3

Both are O(n). As we do not know the guts of the various for loops it is impossible to say.

BTW - Mark it as pseudo code and not C++

answered Jun 25 '13 at 02:37

Ed Heal

59,252
17
87
127

Tony Delroy · Answer 2 · 2013-06-25T06:35:43.427

The 1st one may spend less time incrementing/testing i and conditionally branching (assuming the compiler's optimiser doesn't reduce it to the equivalent of the second one anyway), but with loop unrolling the time taken for the i loop may be insignificant compared to the time spent within the loop anyway.

Countering that, it's easily possible that the choice of separate versus combined loops will affect the ratio of cache hits, and that could significantly impact either version: it really depends on the code. For example, if each of the three if/else statements accessed different arrays at index i, then they'll be competing for CPU cache and could slow each other down. On the other hand, if they accessed the same array at index i, doing different steps in some calculation, then it's probably better to do all three steps while those memory pages are still in cache.

There are potential impacts other than caches - from impact to register allocation, speed of I/O devices (e.g. if each loop operates on lines/records from a distinct file on different physical drives, it's very probably faster to process some of each file in a loop, rather than sequentially process each file), etc..

If you care, benchmark your actual application with representative data.

score 1 · Answer 3 · answered Jun 25 '13 at 02:44

Just from the structure of the loop it is not possible to say which approach will be faster. Algorithmically, both has the same complexity O(n). However, both might have different performance numbers depending upon the kind of operation you are performing on the elements and the size of the container.

The size of container may have an impact on locality and hence the performance. So generally speaking, you would like to chew the data as much as you can, once you get it into the cache. So I would prefer the second approach. To get a clear picture you should actually measure the performance of you approach.

score 0 · Answer 4 · answered Jun 25 '13 at 02:39

The second is only slightly more efficient than the first. You save:

Initialization of loop index
Calling size()
Comparing the loop index with the size()`
Incrementing the loop index

These are very minor optimizations. Do it if it doesn't impact readability.

score 0 · Answer 5 · answered Jun 25 '13 at 02:41

I would expect the second approach to be at least marginally more optimal in most cases as it can leverage the locality of reference with respect to access to elements of the entity collection/set. Note that in the first approach, each for loop would need to start accessing elements from the beginning; depending on the size of the cache, the size of the list and the extent to which compiler can infer and optimize, this may lead to cache misses when a new for loop attempts to read an element even though that element would have been read already by a preceding loop.

And as others have pointed out in their replies, the 2nd approach may also be slightly more optimal in terms of loop index initializations and comparisons. — so1, Jun 25 '13 at 02:46

Which algorithm brings the best performance?

5 Answers5