3

Suppose we have a loop that iterates many times:

for (int i=0; i < 1000000; ++i) {
  int s = 100;
  s += i;
  cout << s;
}

We are only using s inside the loop body, so ideally we'd like to declare it there so it won't pollute the enclosing scope.

I'm wondering if there's any disadvantage to that. For example, will it incur a performance cost, because the program re-declares s on every iteration?

Dun Peal
  • 2
    yes on every iteration it will be constructed and destroyed – deW1 Apr 23 '17 at 16:39
  • @deW1: thanks, I fixed it in my question. I did not intend for it to be UB. – Dun Peal Apr 23 '17 at 16:41
  • 1
    @deW1 I'd be interested to hear an authoritative answer on how compilers would optimize this, though, for simple types. – jwimberley Apr 23 '17 at 16:41
  • 1
    What's the alternative, by the way? Just enclose the whole loop in a block? – Dun Peal Apr 23 '17 at 16:42
  • 1
    I believe this is answered here: http://stackoverflow.com/questions/982963/is-there-any-overhead-to-declaring-a-variable-within-a-loop-c (somewhat C++ specific) and here: http://stackoverflow.com/questions/407255/difference-between-declaring-variables-before-or-in-loop (somewhat Java specific). In summary, the compiler will likely optimize this and there is negligible performance difference. – jwimberley Apr 23 '17 at 16:43
  • 2
    1. declare it outside 2. declare it after `int i` both will work – deW1 Apr 23 '17 at 16:43
  • @Sridharan that's Java. – Quentin Apr 23 '17 at 16:53

3 Answers

7

Conceptually that variable is constructed and destructed on each iteration.

But does it affect performance? Well, you can check your case right here. Delete int on line 7 to switch between the loop-local and function-local variables.
Conclusion: no difference whatsoever. The assembly is the same!

So, just use what makes sense in your code. If you need one object per iteration, make one object per iteration. The optimizer is smarter than you think. If that weren't enough, you'd come back to it with profiling data and careful tweaking, not broad guidelines.

Quentin
  • Brilliant. So conceptually, it is different, and in theory, that can impose a performance penalty with a naive compiler, but in practice, modern compilers should be able to optimize this away. – Dun Peal Apr 23 '17 at 16:57
  • 1
    @DunPeal exactly. Even after having seen quite a bit of C++, I'm still regularly blown away by optimizations the compiler pulls out of its hat. Use `const` whenever you can and crank up the optimization levels and you'll see marvels. – Quentin Apr 23 '17 at 17:01
2

Yes. Declaring a variable inside a loop will cause it to be destroyed and reconstructed on every iteration. This might not be noticeable with small loops and simple data types, which the compiler would optimize anyway. However, when working with complex objects and large loops, it is best to declare the variables outside.

If the variables for a loop use too much memory, you can enclose the loop and the declarations in braces, so that all variables declared inside the braces are destroyed when the block exits. Mostly such micro-optimizations do not matter, but if you're using complex classes and such, just initialize the variable outside the loop and reset it on every iteration.
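A rough sketch of that pattern, assuming a `std::string` stands in for the "complex" object. The function name `make_labels` and the `"item-"` prefix are invented for the example. The extra pair of braces limits the lifetime of `buffer` to the loop, and `clear()` resets it each time; implementations typically keep the string's capacity across `clear()`, though the standard does not strictly promise that.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds one label per index, reusing a single string buffer across iterations.
std::vector<std::string> make_labels(std::size_t count) {
    std::vector<std::string> labels;
    labels.reserve(count);
    {   // extra block: buffer is destroyed at the matching closing brace
        std::string buffer;
        for (std::size_t i = 0; i < count; ++i) {
            buffer.clear();              // reset the contents for this iteration
            buffer += "item-";
            buffer += std::to_string(i);
            labels.push_back(buffer);    // store a copy; buffer is reused next time
        }
    }
    return labels;
}
```

For example, `make_labels(3)` yields `item-0`, `item-1` and `item-2`. Whether this is actually faster than declaring the string inside the loop is worth measuring rather than assuming.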

Generally it is not a good idea to declare too many variables: it makes your code hard to read and increases memory usage. If you can, don't declare variables you don't need. Your example can be simplified to `for (int i = 0; i < 1000000; ++i) cout << i + 100;`, for example. If such optimizations are possible and they do not make your code hard to read, use them.

  • 1
    *"Your example can be simplified to `for(int i = 100;y<1000100;y++){cout< – cdhowie Apr 23 '17 at 16:56
  • @cdhowie I said that compilers optimize these loops normally, and that you should only worry about this if your loops are very complex. –  Apr 23 '17 at 16:57
  • 2
    If variables make your code *harder* to read, you're probably misusing/misnaming them. For example, folding the +100 into the loop header here muddles the number of iterations with the offset of the numbers. IMO this is actually less clear. – Quentin Apr 23 '17 at 16:57
  • @Quentin You're right. I modified the answer. –  Apr 23 '17 at 16:59
  • `for(int i = 0;y<1000000;y++)` -- IMO initializing a variable in loop header that isn't used in loop condition still looks confusing. – zett42 Apr 23 '17 at 17:14
  • @zett42 Sorry, that was a typo. Fixed. –  Apr 23 '17 at 17:15
1

Destroying an int is a noop. The variable ceases to exist, but no runtime code need be run.

Using a reference or pointer to a variable that has ceased to exist is undefined behaviour. Before initialization, a newly created local variable has an indeterminate value. So simply reusing the old variable's storage for the new one is legal; the compiler doesn't have to prove there are no such outstanding references.

In this case, if it can prove that the value was constant 100, it can even skip everything except the first initialization. And it can do this initialization "early" as there is no defined way to detect it happening early. In this case it is easy, and most compilers will do it easily; in more complex cases, less so. If you mark it const, the compiler no longer has to prove it was unmodified, but rather can assume it!

Many of the areas that C++ leaves undefined exist in order to make certain optimizations easy.

Now, if you had something more complex, like a vector<int>{1,2,3,4,5}, destruction and creation become less of a noop. It is still possible to 'hoist' the variable out of the loop, but that is much harder for the compiler, because dynamic allocation is sometimes hard to optimize out.
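To make the allocation cost visible, here is a sketch that counts allocator calls for a loop-local vector versus one hoisted by hand. `CountingAllocator`, `allocations_loop_local` and `allocations_hoisted` are hypothetical names written for this illustration, not standard facilities. The exact counts depend on the standard library implementation; the expectation is only that the hoisted version requests memory far less often.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical allocator that counts how often memory is requested.
template <class T>
struct CountingAllocator {
    using value_type = T;

    // One counter per element type, stored in a function-local static.
    static std::size_t& count() {
        static std::size_t c = 0;
        return c;
    }

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++count();
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

using IntVec = std::vector<int, CountingAllocator<int>>;

// The vector is created and destroyed on every iteration.
std::size_t allocations_loop_local(int iterations) {
    CountingAllocator<int>::count() = 0;
    long long sum = 0;
    for (int i = 0; i < iterations; ++i) {
        IntVec v{1, 2, 3, 4, 5};
        sum += v[i % 5];
    }
    (void)sum;
    return CountingAllocator<int>::count();
}

// The vector is hoisted out of the loop and refilled on every iteration.
std::size_t allocations_hoisted(int iterations) {
    CountingAllocator<int>::count() = 0;
    IntVec v;
    long long sum = 0;
    for (int i = 0; i < iterations; ++i) {
        v.assign({1, 2, 3, 4, 5});   // can reuse existing capacity after the first pass
        sum += v[i % 5];
    }
    (void)sum;
    return CountingAllocator<int>::count();
}
```

Comparing `allocations_loop_local(1000)` with `allocations_hoisted(1000)` shows how many allocator calls the manual hoist saves on a given implementation.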

Yakk - Adam Nevraumont