3

Whenever I need to break out of a for(unsigned int i=0;i<bound;++i) loop in C++, I simply set the index variable i=bound, as described in this answer. I tend to avoid the break statement because, honestly, I do not have a good understanding of what it actually does.

Compare the two loops:

for(unsigned int i=0;i<bound;++i) {
    if (need_break) {   // placeholder condition: "I need a break"
        break;
    }
}

and

for(unsigned int i=0;i<bound;++i) {
    if (need_break) {   // placeholder condition: "I need a break"
        i=bound;
    }
}

I speculate that the second method performs one extra assignment and then one extra comparison between i and bound, so it looks more expensive from a performance point of view. The question, then, is whether it is cheaper to use break than to perform these two operations. Are the compiled binaries any different? Is there any situation in which the second method fails, or can I safely choose either of the two alternatives?


Related: Does `break` work only for `for`, `while`, `do-while`, `switch` and for `if` statements?

Breaking out of a loop without a break statement [C]

Matsmath
  • 3
    "break" means that execution will jump to the first statement after the innermost loop – M.M May 05 '16 at 14:35
  • 7
    To answer your performance concerns, compare the generated assembly between the two versions (with optimization enabled) – M.M May 05 '16 at 14:36
  • 6
    The second approach seems dangerous, as the day will come when the loop limit variable changes from `bound` to `size` and you forget to replace the second `bound`. – atturri May 05 '16 at 14:41
  • 3
    "I have no good understanding of what it *actually* does." Well, that's curious, it's one of the most straightforward constructs out there. It just jumps immediately out of the loop. – Matteo Italia May 05 '16 at 14:45
  • 4
    1) use the construct that more clearly expresses intent (and learn the behavior of the `break` statement - it will likely be more clear to most programmers); 2) there is probably no performance difference of any consequence; 3) there might be a behavior difference (particularly if the loop control variable isn't local to the loop), be careful of this; 4) as others note, there might be consequences for future code maintenance – Michael Burr May 05 '16 at 15:13

5 Answers

7

Using break will be more future proof and more logical.

Consider the following example,

for (i = 0; i < NUM_OF_ELEMENTS; i++)
{
     if(data[i] == expected_item)
         break;
} 

printf("\n Element %d is at index %d\n", expected_item, i);

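The second method is not useful here: setting i to NUM_OF_ELEMENTS overwrites the index at which the element was found, so the printf after the loop could no longer report it.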
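To make the difference concrete, here is a minimal sketch of the same search written both ways. The function names and the use of std::vector are assumptions made for illustration; they are not part of the answer above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the index of the first match, or data.size()
// if there is none. break leaves i untouched, so i can be returned directly.
std::size_t find_with_break(const std::vector<int>& data, int expected_item) {
    std::size_t i = 0;
    for (; i < data.size(); ++i) {
        if (data[i] == expected_item) {
            break;              // i still holds the matching index
        }
    }
    return i;
}

// Same search, but leaving the loop by overwriting the index.
// The match position has to be copied into a second variable first.
std::size_t find_with_assignment(const std::vector<int>& data, int expected_item) {
    std::size_t found = data.size();    // "not found" marker
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] == expected_item) {
            found = i;          // save the position before i is overwritten
            i = data.size();    // the loop ends at the next condition check
        }
    }
    return found;
}
```

Both functions return the same result; the second one simply needs the extra variable found to get there.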

Jeyaram
5

There are three main technical differences that come to mind:

  • as others have stated, if your index variable is not confined to the for scope, break leaves it intact, while your method destroys its content; when you are searching e.g. an array, the code is more concise with break (you don't have to keep an extra variable to record where you stopped);
  • break quits the loop immediately; your method requires you to execute the rest of the body. Of course you can always write:

    for(int i=0; i<n; ++i) {
        if(...) {
            i=n;
        } else {
            // rest of the loop body
        }
    }
    

    but it adds visual and logical clutter to your loop;

  • break is almost surely going to be translated to a simple jmp to the instruction just following the loop (although, if you have block-scoped variables with a destructor the situation may be more complicated); your solution is not necessarily recognized by the compiler as equivalent.

    You can see here that gcc really does generate the code that moves n into i, while in the second case (break) it jumps straight out of the loop.
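The second point, that assigning the index does not skip the rest of the body, can be observed with a small sketch. The function names and the counting logic are hypothetical, chosen only to make the effect visible.

```cpp
#include <cstddef>
#include <vector>

// Counts the elements processed before the first negative value.
// break skips the remainder of the body for the stopping element.
int count_processed_break(const std::vector<int>& values) {
    int processed = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) {
            break;
        }
        ++processed;
    }
    return processed;
}

// Same loop, but "exiting" by overwriting the index. Without an else
// branch, the rest of the body still runs once for the stopping element.
int count_processed_assign(const std::vector<int>& values) {
    int processed = 0;
    const std::size_t n = values.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (values[i] < 0) {
            i = n;          // the loop only ends at the next condition check
        }
        ++processed;        // still executed in the stopping iteration
    }
    return processed;
}
```

For the input {1, 2, -1, 4} the break version counts 2 elements and the assignment version counts 3, because the increment runs once more before the condition is re-tested.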

On the stylistic side:

  • I find "your way" to be overly complicated and not idiomatic - if I encountered it in real code I would ask myself "why didn't he just use a break?", and then check twice to make sure that it's not like I'm missing some side effect and that the intent was actually just to jump out of the loop;
  • as others said, there's some risk of your inner assignment going out of sync with the actual loop condition;
  • it doesn't scale when the loop condition becomes more complicated than a simple range check, both on the logic side (if the loop condition is complicated then tricking it can become more complicated) and on the performance side (if the loop condition is expensive and you already know you want to exit you don't want to check it again); this too can be circumvented by adding an extra variable (which is typically done in languages that lack break), but that's again an extra distraction from what your algorithm is actually doing;
  • it doesn't work at all with range-based loops.
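As an illustration of the last point, a range-based for loop has no index variable to overwrite, so break (or return) is the only direct way out. The function below is a hypothetical example, not code from the question.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the first word longer than min_length, or an empty string
// if no such word exists.
std::string first_long_word(const std::vector<std::string>& words,
                            std::size_t min_length) {
    std::string result;
    for (const std::string& word : words) {
        if (word.size() > min_length) {
            result = word;
            break;      // there is no loop index that could be set to a bound
        }
    }
    return result;
}
```

Calling first_long_word with the words "a", "bcd", "efgh" and a min_length of 2 stops at "bcd" and never inspects "efgh".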
Matteo Italia
  • This assembly output is much appreciated. It certainly pushes me towards the right direction. – Matsmath May 05 '16 at 15:07
  • A better example is two functions that have the same behaviour, so they *could* be compiled to the same asm if the compiler "saw through" this odd idiom. I [moved the `ret += arr[i]` above the `if` in your example](https://godbolt.org/g/McK0X2). As it turns out gcc 6.1 uses `cmov` for the `if`, which increases the length of the loop-counter dependency chain from 1 cycle to 3 cycles (Intel pre-Broadwell) or 2 cycles (Broadwell and later). With no unrolling, the "normal" version can run at one per 2 cycles, but the `i=c` version can only run at one per 3 cycles (pre-Broadwell). – Peter Cordes May 06 '16 at 16:54
  • @PeterCordes: wops, that was definitely not intended; I'll fix the link ASAP. – Matteo Italia May 06 '16 at 20:24
  • Perfect example of why the `i=c` idiom is horrible! Already led to a bug in the trivial code in the answer talking about it. While you're updating your link, you should prob. leave out `-ansi`. That doesn't make sense if you're specifying `-std=c++11`. (I used `-xc -std=c11`, since Godbolt doesn't directly provide C compilers.) Also not needed: `-m32`. The 64bit ABI passes args in regs, so there's less noise. – Peter Cordes May 06 '16 at 20:38
3

I prefer break; because it leaves the loop variable intact.
I frequently use this form while searching for something:

int i;
for(i=0; i<list.size(); ++i)
{
    if (list[i] == target) // I found what I'm looking for!
    {
        break;  // Stop searching by ending the loop.
    }
}

if (i == list.size() ) // I still haven't found what I'm looking for -U2
{
    // Not found.
}
else
{
    // Do work with list[i]. 
}

Are the compiled binaries different?
Almost certainly yes.
(although an optimizer may recognize your pattern, and reduce them to nearly the same)
The break; statement will likely be an assembly "jump" instruction that jumps to the first instruction after the loop, while leaving the control variable unchanged.

Assigning the variable (in non-optimized code) will result in an assignment to the control variable, a test of that variable, and then a resulting jump to end the loop.

As others have mentioned, assigning the variable to its final value is less future-proof, in case your loop condition changes in the future.
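The maintenance risk can be sketched as follows. Assume, hypothetically, that the loop condition was changed from bound to size but the early exit still assigns the old limit; the function and parameter names are invented for this illustration.

```cpp
#include <cstddef>

// Counts loop iterations when the early exit uses break.
std::size_t iterations_with_break(std::size_t size, std::size_t stop_at) {
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < size; ++i) {
        ++iterations;
        if (i == stop_at) {
            break;          // independent of how the loop condition is written
        }
    }
    return iterations;
}

// Deliberately buggy variant: the condition now tests size, but the
// "exit" still assigns the stale limit bound.
// Caution: if bound <= stop_at, the index jumps backwards and this
// loop never terminates.
std::size_t iterations_with_stale_limit(std::size_t size, std::size_t bound,
                                        std::size_t stop_at) {
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < size; ++i) {
        ++iterations;
        if (i == stop_at) {
            i = bound;      // only ends the loop if bound >= size
        }
    }
    return iterations;
}
```

With size 10, bound 5 and stop_at 2, the break version runs 3 iterations, while the stale-limit version skips ahead and keeps going for 7 iterations in total.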

In general, when you say:
"I have no good understanding of what it actually does. (so I use a workaround)",

I respond with:
"Take the time to learn what it does! A main aspect of your job as a programmer is to learn stuff."

abelenky
  • are you sure you can access `i` outside the loop? ( _I deleted my half-typed answer :-)_ ) – Sourav Ghosh May 05 '16 at 14:42
  • @SouravGhosh In the code above, he can't. But you can define i before the loop. int i = 0; for(i = 0... – Fred May 05 '16 at 14:43
  • @abelenky Right, but in case we follow OP's code, what's the point of this answer? (In general, you're very right, though) – Sourav Ghosh May 05 '16 at 14:45
  • I appreciate your answer. Although, it is not at all clear from your post whether the unconditional instruction jump is cheap, or expensive, from performance point of view. That is the stuff, I want to learn. – Matsmath May 05 '16 at 14:58
  • Do you think that `break;` would be included as a keyword in the language if it had a massive performance cost?? The cost of every single assembly instruction should be considered incredibly cheap. Expensive code comes from putting instructions together in really bad ways (eg. bubble-sort, massive memcopies, linear searches of sorted data, etc) Cost does not come from a single instruction. – abelenky May 05 '16 at 15:00
  • Such constructs make me want pythonic `for ... else` in C. – too honest for this site May 05 '16 at 15:05
3

Using break to do this is idiomatic and should be the default, unless for some reason the rather obfuscatory alternative serves to set the stage for logic below. Even then I'd prefer to do the variable setup after the loop exits, moving that setting closer to its usage for clarity.

I cannot conceive of a scenario where the performance matters enough to worry about it. Maybe a more convoluted example would demonstrate that. As noted, the answer for that is almost always 'measure, then tune'.

Steve Townsend
1

In addition to the break statement, which exits a for or [do] while loop, goto may be used to break out of nested loops, e.g.:

    for (i=0; i<k; i++) {
        for (j=0; j<l; j++) {
            if (someCondition) {
                goto end_i;
            }
        }
    }
  end_i: ;  /* empty statement: before C23/C++23 a label must be followed by a statement */
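A complete version of this pattern might look like the sketch below. The grid search, the Position alias and the function name are assumptions added for illustration; only the goto structure comes from the snippet above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Position = std::pair<std::size_t, std::size_t>;

// Returns the (row, column) of the first occurrence of target.
// If target is absent, the row component equals grid.size().
Position find_in_grid(const std::vector<std::vector<int>>& grid, int target) {
    std::size_t i = 0;
    std::size_t j = 0;
    for (i = 0; i < grid.size(); ++i) {
        for (j = 0; j < grid[i].size(); ++j) {
            if (grid[i][j] == target) {
                goto done;      // leaves both loops at once, i and j intact
            }
        }
    }
done:
    return Position(i, j);
}
```

A single break at the same place would only leave the inner loop; the outer loop would then continue with the next row.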
Paul Ogilvie
  • Indeed, and I once encountered a pretty nasty algorithm where there was no other (simple) way to do this than relying on the good old `goto` statement. That inspired [this related question](http://stackoverflow.com/questions/36647765/how-to-transform-a-flow-chart-into-an-implementation). – Matsmath May 05 '16 at 14:54
  • @Matsmath, for the state machine in your reference I would write a C program to parse the state definition and emit a state machine in C. That that C code will be full of `goto` statements doesn't matter as it is machine generated (but you must prove the generator to be correct). – Paul Ogilvie May 05 '16 at 15:01
  • perl puts syntactic sugar on top of this: [you label the outer loop and use `last LABEL`](http://stackoverflow.com/questions/3708527/how-do-i-break-an-outer-loop-from-an-inner-one-in-perl). But anyway, @Matsmath: if you understand the concept of `goto`, then `break` is exactly equivalent to a `goto` to a label outside the loop. – Peter Cordes May 06 '16 at 17:01