I came across some code written by my teacher:

int i;
for(i = 0; i < 10; i++){
    //...
}
//...
for(i = 0; i < 10; i++){
    //...
}

Instead, I would have written:

for(int i = 0; i < 10; i++){
    //...
}
//...
for(int i = 0; i < 10; i++){
    //...
}

Written my way, the memory used to store the variable i is freed between the two loops. I learned that we should use the smallest scope possible to improve the space complexity of our programs.

I was wondering: does the int i operation itself (the variable declaration) use memory? That's the only reason that would explain my teacher's way-to-go.

So if a variable declaration does have an actual memory cost, how does that cost scale? Is one of the two ways always better than the other, or does it depend on context?

dfrib
Axel Carré
  • In what language? In C, there is no cost at all to using `int i` inside the `for` statement instead of before it; any reasonable compiler will optimize it to a single stack push, likely folded into establishing the stack frame for the enclosing routine. It may even be optimized not to use memory at all; `i` may be held entirely in a register or folded into other operations. – Eric Postpischil Nov 19 '20 at 16:00
  • Declare the variable as close to where it's used as possible. Whether the variable uses memory depends on whether the compiler runs out of registers and must spill it to memory. You'll never know that unless you read the output assembly. – phuclv Nov 19 '20 at 16:03
  • It's C, but the same holds: [Is declaration of variables expensive?](/questions/27729930/is-declaration-of-variables-expensive) At least as long as there are no expensive ctors and/or dtors involved. – Deduplicator Nov 19 '20 at 16:07
  • Even the dumbest compiler will generate the same code. – Sam Varshavchik Nov 19 '20 at 16:08
  • If you wonder why your teacher writes code this way, it's not because of optimization. It's probably because of compatibility - some compilers started supporting the other (C99) syntax much later, and had some bugs in the beginning. If your teacher remembers these times (long ago now), he/she probably gained that habit back then. Or maybe started programming when the new syntax wasn't available at all. – anatolyg Nov 19 '20 at 16:23
  • You very likely want to learn about [the _as-if rule_](https://stackoverflow.com/q/15718262/580083). With regard to this rule, this question almost does not make sense. The variable may not even exist at all at runtime (if the loop is unrolled), or it will likely be mapped to a register. – Daniel Langr Nov 19 '20 at 16:25
  • You can use https://godbolt.org/ to test this yourself. On gcc 10, if you put both versions in their own function, the compiler actually only generates the assembly for one function because they are identical in behavior. [Example](https://godbolt.org/z/oe5M4h). Edit: fixed example link. – François Andrieux Nov 19 '20 at 16:27

3 Answers

Do not worry too much about how the compiler will translate your source code! An optimizing compiler may well generate the same machine code for both versions.

What matters is whether the code is readable and robust. This is why best practices recommend using the smallest possible scope for a variable. If you try to use the variable by mistake outside of its scope, the compiler will immediately reject the code, which can save you hours of debugging.
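As a rough illustration of that point, here is a minimal sketch with both styles side by side. The function names `sum_shared` and `sum_scoped` are made up for this example; the scoped form assumes a C99 (or later) or C++ compiler.

```c
/* Hypothetical example: the index is declared once and stays visible
   for the rest of the function. */
int sum_shared(void)
{
    int i;
    int total = 0;
    for (i = 0; i < 10; i++) {
        total += i;
    }
    /* i is still in scope here, so a stray use of it would compile. */
    return total;
}

/* Hypothetical example: the index is scoped to the loop itself. */
int sum_scoped(void)
{
    int total = 0;
    for (int i = 0; i < 10; i++) {
        total += i;
    }
    /* total += i;   <- would not compile: i is not declared in this scope */
    return total;
}
```

Both functions return the same value, and pasting them into a tool such as the Compiler Explorer mentioned in the comments should let you check whether your compiler emits the same code for each.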

But neither the possibility of saving 4 bytes of memory nor the cost of allocating a variable matters, at least in everyday programming. It could only matter for low-level optimizations, and those should only be considered for resource-constrained embedded systems or when profiling has revealed a bottleneck. Long story short: premature low-level optimization is bad practice.

cigien
Serge Ballesta

does the int i operation itself (the variable declaration) use memory?

It will use 4 or 8 bytes of stack space (depending on whether int is 4 bytes or 8 bytes long on your host hardware). However, it will use that same amount of space regardless of whether you declare it above the for line or inside it, so from a memory-usage perspective it doesn't make a difference. (And note that allocating 4 or 8 bytes on the stack is practically zero-cost in terms of CPU cycles, as all it requires is adjusting the value of the stack-pointer register by a constant.)
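If you are curious which size applies on your platform, a tiny sketch like the following reports it. The helper name `int_size_in_bytes` is only for illustration, and note that it tells you the size of the type, not whether the optimizer actually places `i` on the stack.

```c
#include <stddef.h>

/* Hypothetical helper: size of int, in bytes, on the platform
   this code is compiled for. */
size_t int_size_in_bytes(void)
{
    return sizeof(int);
}
```

Printing the result with `printf("%zu\n", int_size_in_bytes());` (after including `<stdio.h>`) shows the value for your compiler and target.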

From a code-correctness perspective, this approach is preferable:

for (int i=0; i<10; i++) {
   [...]
}

... because this way, if you accidentally try to use the variable i after the end of the for-loop's body, it will be caught as a compile-time error (which is usually what you want, as the value of a for-loop variable isn't usually useful after the end of the for-loop).
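To make the risk concrete, here is a small sketch of what the index looks like after the loop when it is declared outside. The function `fill_and_report_index` and its parameters are hypothetical, chosen just for this example.

```c
/* Hypothetical example: fills values[0..count-1] and returns the index
   as it stands once the loop has finished. */
int fill_and_report_index(int *values, int count)
{
    int i;
    for (i = 0; i < count; i++) {
        values[i] = i * 2;
    }
    /* Here i == count, one past the last element that was written.
       An accidental values[i] at this point would compile but would
       access memory outside the range the loop filled. */
    return i;
}
```

With the index declared in the for line instead, the same accidental use would be rejected at compile time rather than misbehaving at run time.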

That's the only reason that would explain my teacher's way-to-go.

Older versions of C didn't support the "declare the variable inside the for-line" syntax, so if your teacher learned to program a long time ago, he may be declaring his variable above the for-line out of habit from the days when that was the only option.

Jeremy Friesner
  • Thank you for the explanations! – Axel Carré Nov 19 '20 at 16:20
  • Sorry, but I don't agree. It can take no stack space at all (and this likely will happen with optimizations) and be just mapped to a register. Or, even, the loop can be unrolled and there will be no `i` at all at runtime. Or, if the loop has no effect, it can be completely optimized away (corner case). The only thing that matters is the _as-if rule_. – Daniel Langr Nov 19 '20 at 16:23
  • @DanielLangr sure. Let's say, it can take *at most* 4 or 8 bytes of stack space. Of course the optimizer is free to optimize to make it take less than that, if it can. – Jeremy Friesner Nov 19 '20 at 19:08

The first C compilers required that automatic object declarations precede any other executable code within a function (see page 15 of https://www.bell-labs.com/usr/dmr/www/cman.pdf). Note that only objects of static duration were allowed to have initializers, and compound statements were not allowed to introduce new objects. These limitations greatly simplified single-pass compilation, and fit very well with Dennis Ritchie's objective that C be a simple language to compile.

By the time the first C Standard was published in 1989, most compilers had extended the language to allow any compound statement to start with automatic object declarations, and to allow automatic object declarations to include initializers. These additions would sometimes make it more difficult for a single pass compiler to generate efficient code, but it generally wasn't too hard to produce correct code. For example, given something like:

int test()
{
  int i,j;
  i=foo();
  j=bar();

a compiler could easily have used one instruction to reserve stack space for both i and j, but if the code had been written as:

int test()
{
  int i=foo(),j=bar();

a single-pass compiler may have had to generate an instruction that allocates stack space for i, then calls foo, then an instruction that allocates stack space for j, and then calls bar. Not as efficient as what would have been produced given the old syntax, but adding the ability to allocate more space wasn't a particular problem.
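For readers who want to compile the two forms, here is a completed sketch. The stubs for `foo` and `bar`, the names `test_separate` and `test_initialized`, and the `i + j` return value are all assumptions added for illustration, since the fragments above are left open.

```c
/* Hypothetical stand-ins for the foo and bar used in the fragments. */
static int foo(void) { return 1; }
static int bar(void) { return 2; }

/* Declarations first, assignments afterwards (the older style). */
int test_separate(void)
{
    int i, j;
    i = foo();
    j = bar();
    return i + j;
}

/* Declarations combined with initializers. */
int test_initialized(void)
{
    int i = foo(), j = bar();
    return i + j;
}
```

Both functions behave identically; the difference described above concerns only how a simple single-pass compiler might have reserved the stack space.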

Although C89 allowed some executable code to precede automatic object declarations, it required that any object declarations within a compound statement block precede any other statements within that block. The C99 standard relaxed this rule, and also allowed the first clause of a for loop to be an automatic object declaration rather than an expression. Although there are some machines for which C99 compilers are not available, the ability to declare a loop's index as part of a for loop is sufficiently useful that it's often desirable to write code that way.
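For code that must still build with a C89-only compiler, one way to keep the index's scope narrow is to wrap each loop in its own compound statement, since a declaration is allowed at the start of any block. This is only an illustrative sketch, not something taken from the answer, and the name `sum_twice` is hypothetical.

```c
/* Sketch: each loop gets its own block, so each i is only visible
   inside that block. The declarations come first in their blocks,
   which keeps the code within C89 rules. */
int sum_twice(void)
{
    int total = 0;

    {
        int i;
        for (i = 0; i < 10; i++)
            total += i;
    }

    {
        int i;
        for (i = 0; i < 10; i++)
            total += i;
    }

    return total;
}
```

The extra braces cost some readability, which is part of why the C99 form is usually preferred where it is available.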

The biggest downside to using that form is that if one has two loops within a block, and it becomes necessary to change one of them or the surrounding code so that the index object gets assigned before the loop or examined after it, many compilers would squawk at a pattern like:

void test(void)
{
  for (int i=0; i<50; i++)
    doSomethingWith(i);

  int i;
  for (i=0; i<50 && shouldntEarlyExit(i); i++)
    doSomethingWith(i);
  doSomethingWithWhatWouldHaveBeenNextValueOf(i);
}

Although the Standard would allow the name i to be used both as the name of an automatic object whose scope is limited to the first loop, and also as an automatic variable whose scope is the enclosing block, many compilers will issue a warning about such name reuse because it is often the result of programming mistakes. Thus, if it becomes necessary to rewrite any of the loops so that the index value is loaded before the loop or examined afterward, it may be necessary to remove the declaration from all loops using that name. Declaring the index at the start of the function rather than declaring it at the start of each loop will avoid the need to remove declarations from looping statements later.
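A minimal sketch of that last suggestion might look like this. The helper bodies, the `processed` counter, and the name `run_both_loops` are assumptions made up so the example compiles; only the helper names `doSomethingWith` and `shouldntEarlyExit` come from the code above.

```c
/* Hypothetical stand-ins for the helpers named in the answer. */
static int processed = 0;
static void doSomethingWith(int i) { (void)i; processed++; }
static int shouldntEarlyExit(int i) { return i < 20; } /* stub: stop at 20 */

int run_both_loops(void)
{
    int i; /* one declaration shared by both loops */

    for (i = 0; i < 50; i++)
        doSomethingWith(i);

    for (i = 0; i < 50 && shouldntEarlyExit(i); i++)
        doSomethingWith(i);

    /* i still holds the value at which the second loop stopped,
       so it can be examined here without redeclaring anything. */
    return i;
}
```

The trade-off is the one discussed in the other answers: the single declaration avoids the name-reuse warning, but `i` is then visible, and usable by mistake, throughout the function.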

supercat