22

While playing with the code I've noticed a strange behavior that I do not know to explain the logic behind

void foo(int n)
{
    int m = n;
    while (--n > 0)
    {
        switch (n)
        {
            case -1:
            case 0:
                for (int j = 0; j < m; ++j)
            default:
                    printf(":-)");
                break;
        }
    }
}

int main()
{
    foo(10);
    return 0;
}

I would expect the printf to execute let's say 10 times. Then I saw it continues to run (imagine 100000 instead of 10) and supposed that the developers (VS) interpreted the printf inside the for (pretty expected), hence the output is made n times for each entrance to switch.

But then turned out j was never initialized.

So my question is why ? Is this an undefined behavior? Is not this a, supposedly, standard code?

dEmigOd
  • 528
  • 4
  • 8
  • 11
    _"Is this an undefined behavior? Is not this a, supposedly, standard code?"_ Nope, the behavior is well known and even used at purpose with the famous _Duff's device_ for loop unrolling within a small range of values (looks slightly different though). – πάντα ῥεῖ Mar 08 '21 at 07:43
  • 10
    @πάνταῥεῖ Skipping `j`'s initialization leads to undefined behavior. – John Kugelman Mar 08 '21 at 07:44
  • @John Yeah, but generally mixing `switch` `case` labels with loops is OK, that's not the primary reason for UB here. – πάντα ῥεῖ Mar 08 '21 at 07:46
  • The reason your compiler is happy with the odd looking placement of `for` is that each `switch()` `case` does not define a separate block (scope). So as odd as it looks, it is legal.. – David C. Rankin Mar 08 '21 at 07:48
  • Protothreads might serve as a good example of how this technique can be utilized (with explanation in the docs). It's kinda hackish, but seems reasonable in case of bare-metal embedded devices: http://dunkels.com/adam/pt/expansion.html – alagner Mar 08 '21 at 08:08
  • It's worth noting that MSVC issues a warning that the initialization of j is skipped by 'default' label. As an interesting side note to that, pulling the declaration of j out into a separate statement: `int j;` `for (j = 0; j < m; ++j)` silences the warning, but does not make the problem go away. – dgnuff Mar 08 '21 at 08:53
  • @dgnuff, you are totally right. I've just ignored it. It even have a very detailed explanation. – dEmigOd Mar 08 '21 at 08:57
  • You ask about standards, but you tag this with both C and C++—two separate languages defined by two separate standards. Which set of standards are we supposed to hold this to? – KRyan Mar 08 '21 at 18:03
  • 1
    @alagner: "kinda hackish"? Wow, I used to write C code for Z-80 embedded devices where 32kb (kilobytes) was our large memory option (we shipped one device with a whopping 48kb once). I never got this hackish. Code should be written so that it's readable by the next human that needs to maintain it. – Flydog57 Mar 08 '21 at 20:08
  • "May the goto live forever!" ;-) Jumping inside a loop makes Nassi-Shneiderman happy! – U. Windl Mar 08 '21 at 21:37

5 Answers5

27

default is just a label (address to where the code jumps if n is not -1 or 0). Thus when n is not -1 or 0, the flow enters the body of the for loop skipping the initialisation of j. You can write the same code also as this, so it would be clearer what is happening here:

int m = n;
while (--n > 0)
{
    switch (n)
    {
        case -1:
        case 0:
            for (int j = 0; j < m; ++j)
            {
        default: printf(":-)");
            }
            break;
    }
}

(Note, as mentioned by @alagner in the comments, it won't compile with C++ compiler but perfectly compiles with C one so this is good enough to make my point and explain how the code looks).

So yes, since j is uninitialised, it is undefined behaviour. If you enable compiler warnings, it will warn you about it (https://godbolt.org/z/rzGraP):

warning: 'j' may be used uninitialized in this function [-Wmaybe-uninitialized]
   12 |                 for (int j = 0; j < m; ++j)
      |                          ^                  
Alex Lop.
  • 6,810
  • 1
  • 26
  • 45
  • 3
    Or even fails to build when compiled as C++. I'm no "language lawyer", but looks like it's the point when then C vs C++ subtleties start to manifest: https://godbolt.org/z/MhjxGG – alagner Mar 08 '21 at 07:58
  • @alagner You are right. C++ is much more strict in such cases. The OP tagged the question as C and C++ so I went with C. But I'll add your note to my answer. – Alex Lop. Mar 08 '21 at 08:01
11

A switch block is effectively a glorified set of goto statements. The different cases don't introduce scopes or any logical structure to the code. They're really just targets for the switch statement to jump to.

In this program, the default: label is inside a nested for loop. When the default case is hit the program jumps inside the loop as if there were a goto statement there. The switch block is equivalent to:

if (n == -1 || n == 0) {
    goto before_loop;
}
else {
    goto inside_loop;
}

before_loop:
for (int j = 0; j < m; ++j) {
    inside_loop:
    printf(":-)");
}

This is dangerous because jumping to inside_loop: skips over j = 0. As you've observed, j is still declared but it's not initialized, and accessing it leads to undefined behavior.

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
7

As posted, the code has undefined behavior because when the switch jumps to the default: label, inside the for statement body, it skips the initialization of j in the inner loop, causing undefined behavior when j is tested and decremented when the loop iterates.

Skipping the initialization with a direct jump into a new scope is not allowed in C++. Such a constraint is not present in the C language, for compatibility with historical code where is does not necessarily cause problems, but modern compilers detect this error and complain. I recommend using -Wall -Wextra -Werror to avoid silly mistakes.

Note that modified as below, it becomes fully defined, prints :) 90 times (9 iterations of the outer loop, times 10 iterations of the inner loop) and finishes successfully:

#include <stdio.h>

void foo(int n) {
    int m = n;
    while (--n > 0) {
        int j = 0;
        switch (n) {
            case -1:
            case 0:
                for (j = 0; j < m; ++j)
            default:
                    printf(":-)");
                break;
        }
    }
}

int main() {
    foo(10);
    printf("\n");
    return 0;
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
5

For starters, the function foo having the return type int returns nothing.

The while loop:

while (--n > 0)
{
    //..
}

gets control only in the case when the value of the pre-decrement expression --n is greater than 0.

That is, within the while loop, the variable n is neither equal to 0 nor to -1.

So, control will be passed at once to the label default within the switch statement.

    switch (n)
    {
        case -1:
        case 0:
            for (int j = 0; j < m; ++j)
        default:
                printf(":-)");
            break;
    }

You may equivalently rewrite the while loop without the switch statement the following way to make it clearer:

while (--n > 0)
{
    goto Default;

    for (int j = 0; j < m; ++j)
    {
        Default: printf(":-)");
    } 
}

That is, control is at once passed inside the for loop. According to the C Standard (6.8.5 Iteration statements)

4 An iteration statement causes a statement called the loop body to be executed repeatedly until the controlling expression compares equal to 0. The repetition occurs regardless of whether the loop body is entered from the iteration statement or by a jump.

It means that the for loop will contain one statement:

printf(":-)");

will be executed.

However, the initial initialization of the variable j in the for loop is bypassed. From the C Standard (6.2.4 Storage durations of objects)

6 For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.

So, the variable j has indeterminate value. It means that the for loop and, as a result, the function itself, have undefined behavior.

Adrian Mole
  • 49,934
  • 160
  • 51
  • 83
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
3

Lots of good explanations, but the key point is missing: that the compiler places a jmp instruction after the printf statement because it just compiled a for statement. The jmp jumps to the condition of the loop and continues there (using an uninitialized j).

Paul Ogilvie
  • 25,048
  • 4
  • 23
  • 41
  • Indeed, this was the key piece of information I was missing to understand the behavior. (Well, I guessed it from reading the other answers, but I'm surprised none of them thought to mention it.) – David Z Mar 08 '21 at 20:27
  • Those who can (display and understand) would view the assember code of the function. And a note: `j` is undefined even before the jump to the loop condition in the `default` case. – U. Windl Mar 08 '21 at 21:42