88

I heard (probably from a teacher) that one should declare all variables on top of the program/function, and that declaring new ones among the statements could cause problems.

But then I was reading K&R and I came across this sentence: "Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function". He follows with an example:

if (n > 0){
    int i;
    for (i=0;i<n;i++)
    ...
}

I played a bit with the concept, and it works even with arrays. For example:

int main(){
    int x = 0 ;

    while (x<10){
        if (x>5){
            int y[x];
            y[0] = 10;
            printf("%d %d\n",y[0],y[4]);
        }
        x++;
    }
}

So when exactly am I not allowed to declare variables? For example, what if my variable declaration is not right after the opening brace? Like here:

int main(){
    int x = 10;

    x++;
    printf("%d\n",x);

    int z = 6;
    printf("%d\n",z);
}

Could this cause trouble depending on the program/machine?

Daniel Scocco
  • 5
    `gcc` is pretty lax. You're using c99 variable length arrays and declarations. Compile with `gcc -std=c89 -pedantic` and you'll get yelled at. According to c99, though, all that's kosher. – Dave Dec 12 '11 at 12:23
  • 6
    The problem is that you've been reading K&R, which is outdated. – Lundin Apr 24 '15 at 11:04
  • 1
    @Lundin Is there an appropriate replacement for K&R?? There is nothing after the ANSI C edition, and the reader of this book can clearly read which standard it refers to – Brandin Sep 02 '15 at 18:02

7 Answers

143

I also often hear that putting variables at the top of the function is the best way to do things, but I strongly disagree. I prefer to confine variables to the smallest scope possible so they have less chance of being misused and so I have less stuff filling up my mental space at each line of the program.

While all versions of C allow lexical block scope, where you can declare the variables depends on the version of the C standard that you are targeting:

C99 onwards or C++

Modern C compilers such as gcc and clang support the C99 and C11 standards, which allow you to declare a variable anywhere a statement could go. The variable's scope starts from the point of the declaration to the end of the block (next closing brace).

if( x < 10 ){
   printf("%d", 17);  // z is not in scope in this line
   int z = 42;
   printf("%d", z);   // z is in scope in this line
}

You can also declare variables inside for loop initializers. The variable will exist only inside the loop.

for(int i=0; i<10; i++){
    printf("%d", i);
}

ANSI C (C90)

If you are targeting the older ANSI C standard, then you are limited to declaring variables immediately after an opening brace [1].

This doesn't mean you have to declare all your variables at the top of your functions though. In C you can put a brace-delimited block anywhere a statement could go (not just after things like if or for) and you can use this to introduce new variable scopes. The following is the ANSI C version of the previous C99 examples:

if( x < 10 ){
   printf("%d", 17);  // z is not in scope in this line

   {
       int z = 42;
       printf("%d", z);   // z is in scope in this line
   }
}

{int i; for(i=0; i<10; i++){
    printf("%d", i);
}}

[1] Note that if you are using gcc, you need to pass the -pedantic flag to make it actually enforce the C90 standard and complain that the variables are declared in the wrong place. With -std=c90 alone, gcc accepts a superset of C90 that also allows the more flexible C99 variable declarations.
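
To try the footnote out, here is a minimal sketch. The file name mixed_decl.c and the function mixed_declaration_demo are made up for illustration, and the exact wording of the diagnostic differs between gcc versions, so the comments only describe what to expect in general terms.

```c
/* mixed_decl.c: hypothetical file name, used only for this sketch.
 *
 * Assumed invocations:
 *   gcc -std=c90 -c mixed_decl.c                    accepted (gcc extension)
 *   gcc -std=c90 -pedantic -c mixed_decl.c          warns about the declaration of z
 *   gcc -std=c90 -pedantic-errors -c mixed_decl.c   turns that warning into an error
 *   gcc -std=c99 -pedantic -c mixed_decl.c          accepted, since C99 allows it
 */

int mixed_declaration_demo(int x)
{
    x++;              /* a statement ... */
    int z = x * 2;    /* ... followed by a declaration: fine in C99, not in strict C90 */
    return z;
}
```

The block uses only /* */ comments on purpose, because // comments are themselves a C99 feature and would add unrelated diagnostics under -std=c90 -pedantic.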

hugomg
  • 1
    "The variable's scope starts from the point of the declaration to the end of the block" - which, in case anyone wonders, doesn't mean manually creating a narrower block is useful/needed to make the compiler use stack space efficiently. I've seen this a couple of times, & it's a false inference from the false refrain that C is 'portable assembler'. Because (A) the variable might be allocated in a register, not on the stack, & (B) if a variable is on the stack but the compiler can see that you stop using it e.g. 10% of the way through a block, it can easily recycle that space for something else. – underscore_d Aug 10 '16 at 17:53
  • 5
    @underscore_d Keep in mind that people who want to save memory often deal with embedded systems, where one is either forced to stick to lower optimisation levels and / or older compiler versions due to certification and / or toolchain aspects. – class stacker Jun 25 '17 at 03:31
  • 2
    I don't know where you got the idea that declaring variables in the middle of a scope is just a "hack that effectively moves the declaration to the top". This is not the case and if you try to use a variable in one line and declare it in the next line you will get a "variable is undeclared" compilation error. – hugomg Dec 23 '17 at 15:20
2

missingno covers what ANSI C allows, but he doesn't address why your teachers told you to declare your variables at the top of your functions. Declaring variables in odd places can make your code harder to read, and that can cause bugs.

Take the following code as an example.

#include <stdio.h>

int main() {
    int i, j;
    i = 20;
    j = 30;

    printf("(1) i: %d, j: %d\n", i, j);

    {
        int i;
        i = 88;
        j = 99;
        printf("(2) i: %d, j: %d\n", i, j);
    }

    printf("(3) i: %d, j: %d\n", i, j);

    return 0;
}

As you can see, I've declared i twice. Well, to be more precise, I've declared two variables, both with the name i. You might think this would cause an error, but it doesn't, because the two i variables are in different scopes. You can see this more clearly when you look at the output of this function.

(1) i: 20, j: 30
(2) i: 88, j: 99
(3) i: 20, j: 99

First, we assign 20 and 30 to i and j respectively. Then, inside the curly braces, we assign 88 and 99. So, why then does the j keep its value, but i goes back to being 20 again? It's because of the two different i variables.

Between the inner set of curly braces the i variable with the value 20 is hidden and inaccessible, but since we have not declared a new j, we are still using the j from the outer scope. When we leave the inner set of curly braces, the i holding the value 88 goes away, and we again have access to the i with the value 20.

Sometimes this behavior is a good thing, other times, maybe not, but it should be clear that if you use this feature of C indiscriminately, you can really make your code confusing and hard to understand.
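
As a sketch of how this kind of hiding can turn into a real bug, consider the two hypothetical helpers below (largest_buggy and largest_fixed are names invented for this example). The only difference is a stray int inside the if block. With gcc or clang, the -Wshadow warning option is meant to flag the buggy version.

```c
/* Hypothetical helper: intended to return the largest element of a[0..n-1]. */
int largest_buggy(const int *a, int n)
{
    int best = a[0];
    int i;
    for (i = 1; i < n; i++) {
        if (a[i] > best) {
            int best = a[i];   /* BUG: declares a new 'best' that hides the outer one */
            (void)best;        /* silences the unused-variable warning */
        }
    }
    return best;               /* the outer 'best' was never updated, so this is a[0] */
}

/* Same logic without the accidental redeclaration. */
int largest_fixed(const int *a, int n)
{
    int best = a[0];
    int i;
    for (i = 1; i < n; i++) {
        if (a[i] > best) {
            best = a[i];       /* assigns to the outer variable */
        }
    }
    return best;
}
```

Note that both versions declare their variables at the top of a block, so the hiding problem does not depend on where in the block the declaration sits.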

haydenmuhl
  • 33
    You made your code hard to read because you used the very same name for two variables, not because you declared variables somewhere other than the beginning of the function. Those are two different problems. I strongly disagree with the statement that declaring variables in other places makes your code hard to read; I think the opposite is true. When writing code, if you declare the variable near where it's going to be used, following the principle of temporal and spatial locality, then when reading you will be able to identify what it does, why it is there, and how it is used very easily. – Havok Jan 03 '14 at 20:47
  • 3
    As a rule of thumb, I declare all variables that are used at several times in the block at the beginning of the block. Some temp variable that is just for a local calculation somewhere, I tend to declare where it is used, as it is of no interest outside that snippet. – Lundin Apr 24 '15 at 11:28
  • 7
    Declaring a variable where it's needed, not necessarily at the top of a block, often lets you initialize it. Rather than `{ int n; /* computations ... */ n = some_value; }` you can write `{ /* computations ... */ const int n = some_value; }`. – Keith Thompson Jul 29 '16 at 14:53
  • @Havok "you used the very same name for two variables" also known as "shadowed variables" (`man gcc` then search for `-Wshadow`). so ya i agree Shadowed variables is demonstrated here. – Trevor Boyd Smith May 22 '20 at 19:24
1

A post shows the following code:

//C99
printf("%d", 17);
int z=42;
printf("%d", z);

//ANSI C
printf("%d", 17);
{
    int z=42;
    printf("%d", z);
}

and I think the implication is that these are equivalent. They are not. If int z is placed at the bottom of this code snippet, it causes a redefinition error against the first z definition but not against the second.

However, multiple successive lines of

//C99
for(int i=0; i<10; i++){}

compile just fine, despite all of them declaring an int i variable, which shows the subtlety of this C99 rule.
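
For reference, a minimal sketch of that situation (sum_twice is a hypothetical function written for this example, and it needs C99 or later). Each loop gets its own i, whose scope ends with that loop.

```c
/* Hypothetical helper: adds 0 + 1 + ... + (n - 1) twice, using two loops. */
int sum_twice(int n)
{
    int total = 0;

    for (int i = 0; i < n; i++) {   /* this i exists only inside the first loop */
        total += i;
    }

    for (int i = 0; i < n; i++) {   /* a separate i, so there is no redefinition */
        total += i;
    }

    /* No i is in scope at this point; naming it here would not compile. */
    return total;
}
```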

Personally, I passionately shun this C99 feature.

The argument that it narrows the scope of a variable is false, as shown by these examples. Under the new rule, you cannot safely declare a variable until you have scanned the entire block, whereas formerly you only needed to understand what was going on at the head of each block.

Lover of Structure
  • 1
    Most other people who are willing to take responsibility for keeping track of their code welcome 'declare anywhere' with open arms due to the many benefits it opens to readability. And `for` is an irrelevant comparison – underscore_d Nov 20 '15 at 23:53
  • It's not as complicated as you make it sound. A variable's scope starts at its declaration and ends at the next `}`. That's it! In the first example, if you want to add more lines that use `z` after the printf you would do it inside the code block, not outside it. You definitely do not need to "scan the entire block" to see if it's OK to define a new variable. I do have to confess that the first snippet is a bit of an artificial example and I tend to avoid it because of the extra indentation it produces. However, the `{int i; for(..){ ... }}` pattern is something I do all the time. – hugomg Jul 28 '16 at 17:36
  • Your claim is inaccurate because in the second code snippet (ANSI C) you can't even put a second declaration of int z at the bottom of the ANSI C block, because ANSI C only lets you put variable declarations at the top. So the error is different, but the result is the same. You can't put int z at the bottom of either of those code snippets. – RTHarston Sep 25 '20 at 21:26
  • Also, what is the problem with having multiple lines of that for loop? The int i only lives within the block of that for loop, so there is no leaking and no repeated definitions of the int i. – RTHarston Sep 25 '20 at 21:28
1

Internally, all variables local to a function are allocated on the stack or in CPU registers. The generated machine code moves values between the registers and the stack (known as register spilling) when the compiler is poor at register allocation or when the CPU doesn't have enough registers to keep all the balls in the air.

To allocate space on the stack, the CPU has two special registers: the Stack Pointer (SP) and the Base Pointer (BP), also called the frame pointer (the stack frame being local to the current function). SP points to the current top of the stack, while BP points to the working dataset (above it) and the function arguments (below it). When a function is invoked, it pushes the BP of the caller onto the stack (at the location SP points to) and sets the current SP as the new BP. It then advances SP by the number of bytes spilled from registers onto the stack, does its computation, and on return restores the caller's BP by popping it from the stack.

Generally, keeping your variables inside their own {} scope could speed up compilation and improve the generated code by reducing the size of the graph the compiler has to walk to determine which variables are used where and how. In some cases (especially when goto is involved) the compiler can miss the fact that a variable won't be used anymore unless you explicitly tell it the variable's scope. Compilers may also have a time or depth limit when searching the program graph.

The compiler could place variables declared near each other in the same stack area, which means loading one will pull the others into cache. Similarly, declaring a variable register could hint to the compiler that you want to avoid that variable being spilled to the stack at all costs.

Strictly ANSI-standard-conforming C89 requires an explicit { before declarations:

"Declarations of variables (including initializations) may follow the left brace that introduces *any* compound statement, not just the one that begins a function."

(K&R (2e), chapter 4 ("Functions and Program Structure"), p84; emphasis in the original.) Extensions introduced by C++ and GCC, and later standardized in C99, allow declaring variables further into the body, which complicates goto and case statements. C++ and C99 further allow declarations inside the for loop initializer, limited to the scope of the loop.
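
One place where the interaction with case shows up: in C99 through C17 a label has to be followed by a statement, and a declaration is not a statement, so writing the declaration directly after `case 1:` is rejected. Giving each case its own braces is the usual workaround. The sketch below assumes a made-up function classify, with invented variable names.

```c
/* Hypothetical function, written only to illustrate declarations under case labels. */
int classify(int kind, int value)
{
    int result = 0;

    switch (kind) {
    case 1: {                      /* the braces open a block, so a declaration may follow */
        int doubled = value * 2;
        result = doubled;
        break;
    }
    case 2: {
        int negated = -value;      /* scoped to this case only */
        result = negated;
        break;
    }
    default:
        result = value;
        break;
    }

    return result;
}
```

The braces also keep one case from jumping past the initialization of a variable that belongs to another case.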

Last but not least, another human being reading your code would be overwhelmed by the top of a function littered with half a hundred variable declarations instead of having them localized where they are used. Localized declarations also make it easier to comment out their use.

TL;DR: using {} to explicitly state a variable's scope can help both the compiler and the human reader.

Lover of Structure
1

If your compiler allows it, then it's fine to declare anywhere you want. In fact, the code is more readable (IMHO) when you declare the variable where you use it instead of at the top of a function, because it makes it easier to spot errors such as forgetting to initialize the variable or accidentally hiding the variable.
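
A small sketch of what that can look like, assuming C99 or later (average is a hypothetical helper, not taken from any answer here). Because mean is declared only once its value is known, it can be const and there is no window in which it is uninitialized.

```c
/* Hypothetical helper: average of a[0..n-1], or 0.0 when n is not positive. */
double average(const int *a, int n)
{
    if (n <= 0) {
        return 0.0;
    }

    long sum = 0;
    for (int i = 0; i < n; i++) {   /* i is visible only where it is needed */
        sum += a[i];
    }

    /* Declared at the point where its value is known, so it is initialized
       immediately and can be marked const. */
    const double mean = (double)sum / n;
    return mean;
}
```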

AndersK
0

With clang and gcc, I encountered major issues with the following (gcc version 8.2.1 20181011, clang version 6.0.1):

  {
    char f1[]="This_is_part1 This_is_part2";
    char f2[64]; char f3[64];
    sscanf(f1,"%63s %63s",f2,f3);  //split part1 to f2, part2 to f3; widths keep each read within the 64-byte buffers
  }

Neither compiler liked f1, f2, or f3 being within the block. I had to relocate f1, f2, and f3 to the function definition area. The compilers did not mind the definition of an integer within the block.

  • Could you add a rudimentary function definition? When I read your solution, I thought your code was the whole program (that is: `main()`). I understand that the code fragments in the other answers don't all do this, but in your case the missing context (that is: other code around the block) hampers understanding. – Lover of Structure Feb 26 '23 at 01:54
-1

As per The C Programming Language by K&R:

In C, all variables must be declared before they are used, usually at the beginning of the function before any executable statements.

Here you can see that the word is "usually", not "must".

  • 1
    These days, not all C is K&R - very little current code compiles with ancient K&R compilers, so why use that as your reference? – Toby Speight Sep 20 '17 at 09:35
  • Its clarity and ability to explain are awesome. I think it's good to learn from the original developers. Yes, it's ancient, but it is good for beginners. – Gagandeep kaur Sep 20 '17 at 17:04
  • Your quoted statement is from chapter 1 ("A Tutorial Introduction") of K&R (2nd ed), p9. It is superseded by other text from chapter 4 ("Functions and Program Structure") of K&R (2nd ed), p84: "Declarations of variables (including initializations) may follow the left brace that introduces *any* compound statement, not just the one that begins a function." – Lover of Structure Feb 25 '23 at 16:30