Break-able named scopes in C/C++

Question

Introduction

This question follows from this one : The named loop idiom : dangerous?. For people who don't want to read the original question, it was about doing things like that :

named(label1) for(int i = 0 ; i < 10 ; i++) {
    for(int j = 0 ; j < 10 ; j++) {
      if(some_condition)
            break(label1); // exit outer loop
      }
}

This new question is about an improved version of the "named loop" idiom. If you're too lazy to read this entire post, you can directly go to the "Example" section of this post to clearly understand what I'm talking about.

Design flaws

Unfortunately, the question was closed very quickly (and later has been re-opened) because it was more a pros/cons debate than a pure technical question. It seems that it was not fitting the SO Q&A format. Moreover, the code I presented had several flaws :

The keyword break was redefined by a macro
The macro were written using lowercase letters

It made some horrible things compilable (at least using MSVC) :

int foo() {

   named(label1) for(int i = 0 ; i < 10; i++)
   {
      if(some_condition)
      {
          break(label1);  // here it's ok, the behavior is obvious
      }   
   }

   break(label1); // it compiles fine without warning... but the behavior is pretty obscur!

}

It could break some good-looking code. For instance, the following is not compiling because of a scoping problem.

int foo() {

named(label1)  for(int i = 0 ; i < 10 ; i++)
       named(label2)  for(int j = 0 ; j < 10 ; j++)
        if(i*j<15)
            cout << i*j << endl;
        else
            break(label2);
 }

A safer implementation

I tried to fix all these problems in order to get a safe version of named loops. More generally, it could also be called breakable scopes because it can be used to exit prematurely any scope, not only loops.

Here is the definition of the two macros NAMED and BREAK :

    #define NAMED(bn)   if(const bool _YOU_CANNOT_USE_NAMED_BREAK_IF_YOU_ARE_OUTSIDE_##bn##_ = true)\
                            goto _named_loop_identifier__##bn##_;\
                        else\
                            _break_the_loop_labelled_##bn##_:\
                            if(true)\
                                {}\
                            else\
                                if(! _YOU_CANNOT_USE_NAMED_BREAK_IF_YOU_ARE_OUTSIDE_##bn##_)\
                                    goto _break_the_loop_labelled_##bn##_;\
                                else\
                                    _named_loop_identifier__##bn##_:\


    #define BREAK(bn) if(!_YOU_CANNOT_USE_NAMED_BREAK_IF_YOU_ARE_OUTSIDE_##bn##_){} \
                        else goto _break_the_loop_labelled_##bn##_

It looks ugly & cumbersome because it also avoids some warnings that could be generated by MSVC or GCC like 'unused variable', 'unreferenced label' or 'suggest explicit braces'. Moreover, it should not compile if used incorrectly, and in that case the error message will be understandable. For instance :

    NAMED(loop1) for(int i = 0 ; i < 10; i++) {
        NAMED(loop2) for(int j = 0 ; j < i ; j++) {
            cout << i << "," << j << endl;
            if(j == 5) {
                BREAK(loop1);   // this one is okay, we exit the outer loop
            }
        }
        BREAK(loop2); // we're outside loop2, this is an error
    }

The previous code will not compile, and the compiler error message for the second BREAK : '_YOU_CANNOT_USE_NAMED_BREAK_IF_YOU_ARE_OUTSIDE_loop2_` which is very explicit.

Example

Before asking my question, I provide two examples to illustrate the (relative) usefulness of those constructs :

break an outer loop :

     NAMED(myloop) for(int i = 0 ; i < rows; i++) {
         for(int j = 0 ; j < cols ; j++) {
            if(some_condition) {
                 BREAK(myloop);
            }
         }
     }

Exit a specific scope :

NAMED(myscope1) {
    cout<< "a";
    NAMED(myscope2)
    {
        cout << "b";
        NAMED(myscope3)
        {
            cout << "c";
            BREAK(myscope2);
            cout << "d";
        }
        cout << "e";
    }
    cout <<"f";
}

This code prints : abcf.

My question

Before defining what my question is, I must define what it isn't since I don't want to see my topic closed in 10 minutes.

It is not "Is it a good idea?", nor "What do you think?" or even "Is it useful?", since stackoverflow doesn't seem to be a discussion place. Anyway, I already know the answers : "Macro are evils", "Goto's are evil" & "Use lambdas instead".

Instead, my question is about technical correctness & robustness against programming errors. I want this construct to be as safe as possible.

Is it possible for an user to misuse it and still be able to compile ? I tried to fix the obvious problems of the original implementation, but C++ is very complex and maybe I missed something ?
Could it silently break some good looking code ? This is my main concern. Can it interfere with the other C++ features (exception handling, destructor calls, others conditionals statements, or anything else...) ?

My objective is to demonstrate that this construct is not intrinsically dangerous. I already know that it would be a very bad idea to use it in real code since others programmers may not clearly understand it, but is it safe-enough to be used in a personal project ?

EDIT : The boolean variable is now const (thanks to Jens Gustedt). I'll try to replace the ifs by fors later to check it can removes spurious warning when used like that :

if(true)
     BREAK(label);

EDIT2: As noticed by JensGustedt, variable declaration in an if statement is not allowed in C (C++ only). Another reason to replace it by a loop.

This is very clever, but honestly I think you would be better off just using `goto`. — zwol, Jun 17 '12 at 14:55
Everyone knows `goto`. People who read this will probably sit down and try to figure out what the macros are doing, and then look at the code where they're used. This wouldn't be a great use of time when first reading your source code. — chris, Jun 17 '12 at 15:09
@Zack : The goal of this construct is to avoid using `goto` directly, and provide a safer syntax for exiting nested scopes. Here the label is automatically created to jump outside the scope, reducing the risk of error. — Frédéric Terrazzoni, Jun 17 '12 at 15:11
@chris : I totally agree with you, that's why I precised it is not aimed to be used in real code. I like this syntax and I want to use it in my personal project. Nobody else is supposed to read my code. I only want to be sure it's doesn't have a hidden design flaw that could break my code silently in some unusual situations (could be related to exception handling, destructor calls, etc...) — Frédéric Terrazzoni, Jun 17 '12 at 15:14
@FrédéricTerrazzoni, If it's only you, then I suppose it might be a bit better. I'm not sure if it will mess anything up, but it's too bad this wasn't implemented in C++ at all to be honest. I've found myself staring at that problem and having to do something about it more than once. — chris, Jun 17 '12 at 15:16
Questions of the form "can you see an error in my code?" are better-suited to http://codereview.stackexchange.com. — Oliver Charlesworth, Jun 17 '12 at 15:21
In your NAMED definition, is the 3rd `if` really necessary? its expression always evaluates to false doesn't it? **EDIT:** Ah, to avoid the unused label warning. OK. — Jesse Chisholm, Jun 17 '12 at 15:25
@OliCharlesworth : I didn't know about that website, thank you. I'll post there next time I've a code to check (don't want to double post it everywhere!) — Frédéric Terrazzoni, Jun 17 '12 at 15:26
When I see code that tries to do this, I instead try to refactor that double-loop into something else. I tend to refactor it as a separate method from which I can do an early return. Much cleaner than using `goto` among several other benefits. — Mike Bailey, Jun 17 '12 at 16:40
@chris: If it's "not aimed to be used in real code" and only meant for use in his personal project, I would say that this question doesn't belong here as Oli Charlesworth says. If this isn't meant to be used by others, this is way too localized and doesn't belong here. — Mike Bailey, Jun 17 '12 at 16:47
@MikeBantegui : Maybe some other people could be interested as well as I am. I'm not the only one who like to play with C/C++ features. — Frédéric Terrazzoni, Jun 17 '12 at 16:54
@FrédéricTerrazzoni: If that is the case, then please phrase your question in the format. Otherwise, this question sounds highly localized. I like playing with C++ as well, but statements such as "not to be used in real code" suggest this question is better suited on code review. — Mike Bailey, Jun 17 '12 at 17:37

Jesse Chisholm · Answer 1 · 2012-06-17T15:43:49.270

I presume you have tested this, but ... isn't the bool variable you declare in NAMED(bn) still available later in the same block the if statement it is declared in exists? (: Your idiom won't work unless it is so. :)

I can see this being safe:

{
    NAMED(one) { ... }
}
BREAK(one);

The compiler will fuss on the BREAK(nb); line

But this still looks unsafe:

{
    NAMED(bn) { ... }
    BREAK(bn);
}

Unsafe in the sense that the compiler may still accept the variable as defined and not fuss. But it could form an infinite loop, thus silently breaking your program.

-Jesse

PS: It will not interfere with try..finally as the finally block is defined to be executed regardless of how you exit the try block. So as long as you aren't trying to avoid the finally block, you are OK.

P(PS)S: the only really weird interaction with other constructs I see is this:

#if DEBUG
    NAMED(bn)
#endif
    while(true)
    {
        BREAK(BN);
    }

And that is pathological! ;-D That will compile fine in DEBUG but get the compiler error in RELEASE.

`BREAK` requires the boolean variable to be defined in its scope. The `NAMED` macro creates a new scope to store it, so in this case it won't compile and the compiler will throw `'_YOU_CANNOT_USE_NAMED_BREAK_IF_YOU_ARE_OUTSIDE_bn_' was not declared in this scope` — Frédéric Terrazzoni, Jun 17 '12 at 15:46
Of course if the NAMED is only defined in DEBUG, it won't work. That said, I prefer a compilation error than a runtime crash / undefined behavior :) — Frédéric Terrazzoni, Jun 17 '12 at 15:53

score 1 · Answer 2 · answered Jun 17 '12 at 15:31

1

I really can't answer for c++ aspects, but since you also labeled it C ...

Usually using if/else constructs for such a thing is not so good of an idea, since you will always find a good-meaning C compiler that will find a warning in place for bizarre else matchings and things like that.

I usually use for constructs like that for P99. They avoid dangling else (or spurious warnings about it). Also for is the only place in C that you may place local variables the way you need them, they aren't allowed in if or while as they would be in C++.

On the "safety" side, you'd probably should declare the variable register bool const instead of just bool. So nobody could try to change it or even take its address to change it behind your back.

But for the particular purpose that you are using it I am not too much fond of for or goto either. What you are really doing is unwinding the stack. In C there is one construct to do that properly, namely setjmp/longjmp. In C++ it probably would be try/catch.

answered Jun 17 '12 at 15:31

Jens Gustedt

76,821
6
102
177

I labelled C & C++ since it doesn't rely on any C++ specific feature. Regarding the `for` vs `if` choice, I think this is a very good suggestion since I got some spurious warning writing if(condition) BREAK(label); else { }. You second suggestion of defining the variable `register bool const` is also interesting to prevent the malicious user to hijack it. Thank you. – Frédéric Terrazzoni Jun 17 '12 at 15:39
I absolutely would not recommend exception semantics for scope exiting. Using try/catch for control flow is hideous and *bad*, simply because that's not what it is for. – Xeo Jun 17 '12 at 15:40
His macros are defined without dangling `else`s, so the compilers should all be happy there. Both macros emit complete statements. – Jesse Chisholm Jun 17 '12 at 15:47
1

@FrédéricTerrazzoni, yes it does rely on a specific C++ feature. Variable declarations inside the `if` clause is not possible in C, as I said. – Jens Gustedt Jun 17 '12 at 16:14
Speaking personally, I would rather use the macros than `setjmp()` and `longjmp()`. – Jonathan Leffler Jun 17 '12 at 16:21
@JensGustedt : Sorry I misread. I didn't know that it was C++ specific (it seemed so natural...). I'll edit my original post to precise it won't compile on a pure C compiler. – Frédéric Terrazzoni Jun 17 '12 at 16:25