6

Which of the following is better and why? (Particular to c++)

a.

int i(0), iMax(vec.length());//vec is a container, say std::vector
for(;i < iMax; ++i)
{
  //loop body
}

b.

for( int i(0);i < vec.length(); ++i)
{
  //loop body
}

I have seen advice for (a) because of the call to length function. This is bothering me. Doesn't any modern compiler do the optimization of (b) to be similar to (a)?

Amol Gawai
  • 3,220
  • 4
  • 23
  • 25

14 Answers14

13

Example (b) has a different meaning to example (a), and the compiler must interpret it as you write it.

If, (for some made-up reason that I can't think of), I wrote code to do this:

for( int i(0);i < vec.length(); ++i)
{
    if(i%4 == 0)
       vec.push_back(Widget());
}

I really would not have wanted the compiler to optimise out each call to vec.length(), because I would get different results.

Andrew Shepherd
  • 44,254
  • 30
  • 139
  • 205
  • 1
    Perfect answer. You might want to add that if you don't change the length of the vector inside the loop, construct (a) in the question (or some of the similar suggestions in the other answers) should be preferred because it is hard for the compiler to optimize the call to length() away – stephan Jun 25 '09 at 07:15
  • 4
    compiler optimization or not if you use iterator with begin() - end() you'll get the end pointer once and you don't need to worry about size and all ... – stefanB Jun 25 '09 at 07:45
10

I like:

for (int i = 0, e = vec.length(); i != e; ++i)

Of course, this would also work for iterators:

for (vector<int>::const_iterator i = v.begin(), e = v.end(); i != e; ++i)

I like this because it's both efficient (calling end() just once), and also relatively succinct (only having to type vector<int>::const_iterator once).

Assaf Lavie
  • 73,079
  • 34
  • 148
  • 203
  • This approach seems to be more readable to me. Is it documented in any book? Which? I am reading this for the first time on stackoverflow. – Amol Gawai Jun 25 '09 at 06:18
  • 1
    No, `end` is called more than once but since the call is inlined that won't matter. – Konrad Rudolph Jun 25 '09 at 07:23
  • @Konrad, why do you say end is called more than once in the 2nd example? How many times do you think it's called (and why)? – Assaf Lavie Jun 25 '09 at 09:32
  • what's wrong with calling i != v.end() in loop? it's probably inlined and end() returns pointer one past end of array ... – stefanB Jul 02 '09 at 10:58
  • @stefan, true, but they also might not be. And in any case, why not write it in the way that you know for certain is most efficient, regardless of compiler optimizations? – Assaf Lavie Jul 02 '09 at 15:38
  • @stefan, I suggest you do some measurements with your compiler and measure the actual difference between these styles. I did this once for VC 2003 and there was a real difference. – Assaf Lavie Jul 02 '09 at 15:39
5

I'm surprised nobody has said the obvious:

In 99.99% of cases, it doesn't matter.

Unless you are using some container where calculating size() is an expensive operation, it is unfathomable that your program will go even a few nanoseconds slower. I would say stick with the more readable until you profile your code and find that size() is a bottleneck.

rlbond
  • 65,341
  • 56
  • 178
  • 228
  • 1
    It means something different! So it _does_ matter! – xtofl Jun 25 '09 at 07:44
  • 3
    In 99.99% of cases, it doesn't matter. It only matters (a) if size() changes inside the loop, or (b) if checking size() is prohibitively slow. – rlbond Jun 25 '09 at 16:03
3

There are two issues to debate here:

  1. The variable scope
  2. The end condition re-evaluation

Variable scope

Normally, you wouldn't need the loop variable to be visible outside of the loop. That's why you can declare it inside the for construct.

End condition re-evaluation

Andrew Shepherd stated it nicely: it means something different to put a function call inside the end condition:

for( vector<...>::size_type i = 0; i < v.size(); ++i ) { // vector size may grow.
   if( ... ) v.push_back( i ); // contrived, but possible
}

// note: this code may be replaced by a std::for_each construct, the previous can't.
for( vector<...>::size_type i = 0, elements = v.size(); i != elements; ++i ) {
}
xtofl
  • 40,723
  • 12
  • 105
  • 192
2

Why is it bodering you? Those two alternatives dont see to be doing the same. One is doing a fixed number of iterations, while the other is dependant on the loops body.

Another alternative colud be

for (vector<T>::iterator it=vec.begin();it!=vec.end();it++){
 //loop body
}
Tom
  • 43,810
  • 29
  • 138
  • 169
  • 1
    And similar to other answers, you can also use: for (vector::iterator it = vec.begin(), end = vec.end(); it != end; ++it) { process(*it); } –  Jun 25 '09 at 05:22
  • 2
    Furthermore, once it's down to a single function: std::for_each(vec.begin(), vec.end(), process); –  Jun 25 '09 at 05:25
2

Unless you need the loop variable outside the loop, the second approach is preferable.

Iterators will actually give you as good or better performance. (There was a big comparison thread on comp.lang.c++.moderated a few years back).

Also, I would use

int i = 0;

Rather than the constructor like syntax you're using. While valid, it's not idiomatic.

nsanders
  • 12,250
  • 2
  • 40
  • 47
2

Somewhat unrelated:

Warning: Comparison between signed and unsigned integer.

The correct type for array and vector indices is size_t.

Strictly speaking, in C++ it is even std::vector<>::size_type.

Amazing how many C/C++ developers still get this one wrong.

DevSolar
  • 67,862
  • 21
  • 134
  • 209
  • 3
    Yup - especially since the correct answer is vector<>::size_type not size_t. – MSalters Jun 25 '09 at 07:42
  • Strictly speaking, yes, you are correct. But I'd be happy enough to see size_t already, and the chances of educating people to use `vector<>::size_type` are... negligible. – DevSolar Jun 25 '09 at 09:55
  • The value of using vector<>::size_type over size_t is negligible as well. It's only useful in cases where you are already working with a template container (in which case container_type::size_type is more explicit than size_t). – Tom Jun 25 '09 at 12:12
1

Let's see on the generated code (I use MSVS 2008 with full optimization).

a.

int i(0), iMax(vec.size());//vec is a container, say std::vector
for(;i < iMax; ++i)
{
  //loop body
}

The for loop produces 2 assembler instructions.

b.

for( int i(0);i < vec.size(); ++i)
{
  //loop body
}

The for loop produces 8 assembler instructions. vec.size() is successfully inlined.

c.

for (std::vector<int>::const_iterator i = vec.begin(), e = vec.end(); i != e; ++i)
{
  //loop body
}

The for loop produces 15 assembler instructions (everything is inlined, but the code has a lot of jumps)

So, if your application is performance critical use a). Otherwise b) or c).

Sergey Podobry
  • 7,101
  • 1
  • 41
  • 51
  • Interesting. How to know assembler instruction details? – Amol Gawai Jun 25 '09 at 06:33
  • Except that in c, accessing elements in vec is fewer assembly instructions than in a or b. – rlbond Jun 25 '09 at 06:43
  • Got the answer on stackoverflow itself. http://stackoverflow.com/questions/840321/how-can-i-see-the-assembly-code-for-a-c-program – Amol Gawai Jun 25 '09 at 06:59
  • 1
    Number of assembly instructions usually doesn't matter. Modern CPUs remap instructions. They also have a bandwidth limit on accessing main memory. Since all loops access the same elements, they're all likely to hit this same limit. – MSalters Jun 25 '09 at 07:41
  • Modern CPUs can remap instructions but cannot skip them. And CPU cache may reduce influence of the memory bandwidth limit. But I agree with you, in most cases there will be no significant differrence between a), b) and c). – Sergey Podobry Jun 25 '09 at 09:08
1

It should be noted that the iterator examples:

for (vector<T>::iterator it=vec.begin();it!=vec.end();it++){
 //loop body
}

could invalidate the loop iterator 'it' should the loop body cause the vector to reallocate. Thus it is not equivalent to

for (int i=0;i<vec.size();++i){
 //loop body
}

where loop body adds elements to vec.

equackenbush
  • 136
  • 3
0

Simple question: are you modifying vec in the loop?

answer to this question will lead to your answer too.

jrh

jrharshath
  • 25,975
  • 33
  • 97
  • 127
0

It's very hard for a compiler to hoist the vec.length() call in the safe knowledge that it's constant, unless it gets inlined (which hopefully it often will!). But at least i should definitely be declared in the second style "b", even if the length call needs to be "manually" hoisted out of the loop!

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
0

This one is preferable:

typedef vector<int> container; // not really required,
                               // you could just use vector<int> in for loop

for (container::const_iterator i = v.begin(); i != v.end(); ++i)
{
    // do something with (*i)
}
  • I can tell right away that the vector is not being updated
  • anyone can tell what is happening here
  • I know how many loops
  • v.end() returns pointer one past the last element so there's no overhead of checking size
  • easy to update for different containers or value types
stefanB
  • 77,323
  • 27
  • 116
  • 141
0

(b) won't calculate/call the function each time.

-- begin excerpt ----

Loop Invariant Code Motion: GCC includes loop invariant code motion as part of its loop optimizer as well as in its partial redundancy elimination pass. This optimization removes instructions from loops, which compute a value which does not change throughout the lifetime of a loop.

--- end excerpt --

More optimizations for gcc:

https://www.in.redhat.com/software/gnupro/technical/gnupro_gcc.php3

0

Why not sidestep the issue entirely with BOOST_FOREACH

#include <boost/foreach.hpp>

std::vector<double> vec;

//...

BOOST_FOREACH( double &d, vec)
{
    std::cout << d;
}
Roddy
  • 66,617
  • 42
  • 165
  • 277