32

I'm currently trying to figure out a way to break out of a for loop from within a function called in that loop. I'm aware of the possibility to just have the function return a value and then check against a particular value and then break, but I'd like to do it from within the function directly.

This is because I'm using an in-house library for a specific piece of hardware that mandates the function signature of my function to look like this:

void foo (int passV, int aVal, long bVal)

I'm aware that not using a return value is very bad practice, but alas circumstances force me to, so please bear with me.

Consider the following example:

#include <stdio.h>

void foo (int a) {
    printf("a: %d", a);
    break;
}

int main(void) {
    for (int i = 0; i <= 100; i++) {
        foo(i);
    }
    return 0;
}

Now this does not compile. Instead, I get a compilation error as follows:

prog.c: In function 'foo':
prog.c:6:2: error: break statement not within loop or switch
  break;

I know what this means (the compiler says that the break in foo() is not within a for loop).

Now, what I could find from the standard regarding the break statement is this:

The break statement causes control to pass to the statement following the innermost enclosing while, do, for, or switch statement. The syntax is simply break;

Considering my function is called from within a for loop, why doesn't the break statement break out of said for loop? Furthermore, is it possible to realise something like this without having the function return first?

Peter Mortensen
Magisch
  • 2
    **innermost enclosing while, do, for, or switch statement**: what the compiler finds enclosing the break is the function, not a loop, hence the error. – AchmadJP Feb 15 '16 at 08:08
  • 33
    Look up "longjmp". Or, which might be better advice, don't. – Thomas Padron-McCarthy Feb 15 '16 at 08:09
  • @ThomasPadron-McCarthy Im working with multiple proprietary functions in a larger project and im afraid I don't have a choice. – Magisch Feb 15 '16 at 08:11
  • 2
    What's wrong with putting `break` right after the call to `foo` ? – Jabberwocky Feb 15 '16 at 08:16
  • 5
    Using `longjmp()` would almost surely require much more work than adding a return result that can be checked. If you'd modify the function to add the `break`, why the resistance to modifying it to return a result? – Michael Burr Feb 15 '16 at 08:19
  • @MichaelWalz I need logic available only within foo to break out of it. foo is also mandated to return void by my project – Magisch Feb 15 '16 at 08:19
  • 6
    @Magisch the requirement of returning void sounds really strange to me. You should discuss this with the person responsible for this somewhat questionable decision. – Jabberwocky Feb 15 '16 at 08:24
  • @MichaelWalz I know its bad coding practice, but unfortunately I cannot ask the person that wrote this as they already left and I have no way of modifying this – Magisch Feb 15 '16 at 08:24
  • 1
    @Magisch source code no more available ? OMG. – Jabberwocky Feb 15 '16 at 08:25
  • @MichaelWalz Alternative would be rewriting what once was supposedly a good 80k lines of code ;) – Magisch Feb 15 '16 at 08:26
  • Are you allowed to change the signature of `foo()`? If so you could pass the return value via pointers. See http://stackoverflow.com/questions/2229498/passing-by-reference-in-c for more information. – Tarok Feb 15 '16 at 11:04
  • 2
    Downvoted, since THIS is the best example of people insisting on `goto` patterns in the way that gave `goto` the reputation it has today. I'm an advocate of using `goto` when there is a real benefit from it. And **THIS** makes me cry. – dhein Feb 15 '16 at 12:45
  • @Zaibis I don't want to code this way, but constraints in the in-house library and hardware driver im using force me to. – Magisch Feb 15 '16 at 12:51
  • 2
    What happens if you want to call `foo` from somewhere else in your code that isn't a loop? What would you expect the `break` statement to do there? – rhughes Feb 15 '16 at 14:04
  • @rhughes I would check against that not happening, but in the example, UB – Magisch Feb 15 '16 at 14:06
  • @Magisch OK, so you would have some code like: `void foo(bool isInLoop) { if( isInLoop) break; }` ? – rhughes Feb 15 '16 at 14:14
  • @rhughes Roughly. The implementation I went with uses a Longjump with a similar statement. Two of them, actually, and which one to use (if at all) is passed in a variable. – Magisch Feb 15 '16 at 14:15
  • 1
    @Magisch What you are saying sounds inconsistent, given that you cannot change `foo()`'s signature because it is part of code you are not able (for reasons of effort) to rewrite. Assumption: you cannot know how your function will be called internally. If that is the case, `foo()` might be called without `setjmp` having been invoked first, which is UB. So you say the unit tests are fine, but realize: if my assumption is correct, you are risking **UB! DON'T DO THIS!** And if my assumption is wrong, then you should just rewrite the code instead. So either way: **THIS** approach is wrong. – dhein Feb 15 '16 at 14:32
  • @Zaibis Im not sure what exactly causes the constraint, what im saying is directly from the doc of the library. I assume its because its called from assembler code (there is ALOT of interweaved assembler making up the library) and expects a certain function structure. I don't even have access to all the assembler code so I cannot dig through to test. So far the unit tests I've done look promising though, and considering how much of a clustertruck this process already is, I may not have a choice to do it differently. – Magisch Feb 15 '16 at 14:39
  • @Zaibis Completly elaborating all constraints would probably break SO format by quite a bit, so im trying to be reductive here. The question also explicitly states that the code above is wrong/does not compile, so I fail to see how its bad that above code is incorrect. – Magisch Feb 15 '16 at 14:39
  • @Zaibis Basicly, I am able to set a jump and use that jump, and I am able to determine wether or not it should use that jump. The function is called from within the library, too, and so thats why there is a mandated signature. Presently im just happy that it works and will do more extensive testing. – Magisch Feb 15 '16 at 14:42
  • @Magisch: I'm just saying, you are using a structure that you can't change.This means your own code is depending on the lib code. This if you are coding well, just can be caused, because your function will be somway used internally. And as you just admited clearly in your last comment, you can't fully understand this internal code. So you can't exclude the case that your function(`foo()`) might be called internally. and should this happen without an previous `setjmp` (what probably isn't respected by the lib) you will cause undefined behavior. and thats why I try to say you shouldn't use this. – dhein Feb 15 '16 at 14:45
  • @Magisch: the problem with undefined behavior is that testing alone won't help, since it might go right many times and then one day start behaving differently for no apparent reason. – dhein Feb 15 '16 at 14:46
  • @Zaibis Presently the jump instruction only happens when I pass a very specific value as parameter, one that should be reasonably never passed by the library, if that happens, I guess im just boned. But dems the breaks in working with partially available badly made code in a language you dont understand by another apprentice 10 years ago. – Magisch Feb 15 '16 at 14:47
  • 1
    @Zaibis Its a sad fact of the situation, but I've had to use many things that would make any good programmer vomit already (like upwards of 40 global variables) – Magisch Feb 15 '16 at 14:48
  • 1
    Your should also state whether there is any chance of your code having to be thread-safe, and perhaps whether you call it directly or indirectly, e.g. is it in reality a call-back function that you pass into a library. – PJTraill Feb 15 '16 at 14:49
  • @PJTraill I must admit I cannot state any of the above with certainty as I only have part of the source code of the library and some documentation written by an apprentice 10 years ago. Its a mine field, I know. I kept the question very general on purpose to avoid trying to delve into an unsolveable highly specific mess of code that would make legit programmers vomit. – Magisch Feb 15 '16 at 14:50
  • 1
    @Magisch You could always use a global variable to fake a return value here, and make it 41. – user253751 Feb 16 '16 at 09:36

14 Answers

46

You cannot use break; this way: it must appear inside the body of the for loop.

There are several ways to do this, but none of them is recommended:

  • You can exit the program with the exit() function. Since the loop is run from main() and you do not do anything after it, it is possible to achieve what you want this way, but it is a special case.

  • You can set a global variable in the function and test that in the for loop after the function call. Using global variables is generally not recommended practice.

  • You can use setjmp() and longjmp(), but it is like trying to squash a fly with a hammer: you may break other things and miss the fly altogether. I would not recommend this approach. Furthermore, it requires a jmp_buf that you will have to pass to the function or access as a global variable.

An acceptable alternative is to pass the address of a status variable as an extra argument: the function can set it to indicate the need to break from the loop.

But by far the best approach in C is returning a value to test for continuation: it is the most readable.

From your explanations, you don't have the source code for foo() but can detect some conditions in a function that you can modify called directly or indirectly by foo(): longjmp() will jump from its location, deep inside the internals of foo(), possibly many levels down the call stack, to the setjmp() location, bypassing regular function exit code for all intermediary calls. If that's precisely what you need to do to avoid a crash, setjmp() / longjmp() is a solution, but it may cause other problems such as resource leakage, missing initialization, inconsistent state and other sources of undefined behavior.

Note that your for loop will iterate 101 times because you use the <= operator. The idiomatic for loop uses for (int i = 0; i < 100; i++) to iterate exactly the number of times that appears as the upper (excluded) bound.
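
The iteration counts mentioned above can be checked with a small sketch. The function names count_inclusive, count_exclusive and show_counts are made up for this illustration and do not come from the question.

```c
#include <stdio.h>

/* Counts how many times the body runs with an inclusive bound (i <= n). */
int count_inclusive(int n)
{
    int count = 0;
    for (int i = 0; i <= n; i++)
        count++;
    return count;
}

/* Counts how many times the body runs with an exclusive bound (i < n). */
int count_exclusive(int n)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        count++;
    return count;
}

/* Prints both counts for n = 100; call it from main() to try it out. */
void show_counts(void)
{
    printf("i <= 100: %d iterations\n", count_inclusive(100));
    printf("i <  100: %d iterations\n", count_exclusive(100));
}
```

With n = 100 the inclusive form runs the body 101 times and the exclusive form 100 times.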

chqrlie
  • 6
    It should be emphasized that evaluating a return value is absolutely the way to go, as it yields best readability. – rexkogitans Feb 15 '16 at 14:29
25

break, like goto, can only jump locally within the same function, but if you absolutely have to, you can use setjmp and longjmp:

#include <stdio.h>
#include <setjmp.h>

jmp_buf jump_target;

void foo(void)
{
    printf("Inside foo!\n");
    longjmp(jump_target, 1);
    printf("Still inside foo!\n");
}

int main(void) {
    if (setjmp(jump_target) == 0)
        foo();
    else
        printf("Jumped out!\n");
    return 0;
}

The call to longjmp will cause a jump back to the setjmp call. The return value from setjmp shows if it is returning after setting the jump target, or if it is returning from a jump.

Output:

Inside foo!
Jumped out!

Nonlocal jumps are safe when used correctly, but there are a number of things to think carefully about:

  • Since longjmp jumps "through" all the function activations between the setjmp call and the longjmp call, if any of those functions expect to be able to do additional work after the current place in execution, that work will simply not be done.
  • If the function activation that called setjmp has terminated, the behaviour is undefined. Anything can happen.
  • If setjmp hasn't yet been called, then jump_target is not set, and the behaviour is undefined.
  • Local variables in the function that called setjmp can under certain conditions have undefined values.
  • One word: threads.
  • Other things, such as that floating-point status flags might not be retained, and that there are restrictions on where you can put the setjmp call.

Most of these follow naturally if you have a good understanding of what a nonlocal jump does on the level of machine instructions and CPU registers, but unless you have that, and have read what the C standard does and does not guarantee, I would advise some caution.
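
As a rough sketch of how the cautions above play out when the goal is to leave a for loop, here is one possible arrangement. It assumes the foo signature from the question; the names loop_exit, jump_armed and run_loop, and the break condition passV == 3, are invented for illustration. It is not a claim about what the asker's library tolerates.

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf loop_exit;        /* hypothetical name: jump target set by the caller */
static volatile int jump_armed;  /* nonzero only while loop_exit is valid */

/* Signature taken from the question; the body is illustrative only. */
void foo(int passV, int aVal, long bVal)
{
    (void)aVal;
    (void)bVal;
    printf("a: %d\n", passV);
    if (passV == 3 && jump_armed)   /* assumed break condition */
        longjmp(loop_exit, 1);      /* does not return to the loop body */
}

/* Runs the loop and returns how many calls to foo() were started. */
int run_loop(void)
{
    volatile int calls = 0;         /* volatile: modified between setjmp and longjmp */

    jump_armed = 1;
    if (setjmp(loop_exit) == 0) {
        for (int i = 0; i <= 100; i++) {
            calls++;
            foo(i, 0, 0L);
        }
    }
    jump_armed = 0;                 /* the target must not be used after this point */
    return calls;
}
```

The guard flag addresses the "setjmp hasn't yet been called" item, and the volatile counter addresses the point about local variables. Neither helps if library code between the loop and foo() expects foo() to return normally.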

Thomas Padron-McCarthy
  • In this example, where is longjmp jumping to and how can I set that? – Magisch Feb 15 '16 at 08:22
  • `longjmp` jumps back to the `setjmp` call. The return value shows if it is returning after setting the jump target, or if it is returning from a `longjmp` jump. – Thomas Padron-McCarthy Feb 15 '16 at 08:25
  • Oooh nice, this looks like exactly what I wanted :) Thank you very much – Magisch Feb 15 '16 at 08:26
  • 2
    @Magisch: I'm curious how `setjmp`/`longjmp` would reduce the 'rewrite' necessary as compared to returning a value to be checked? Using `setjmp`/`longjmp` will require both the call site and the function to be modified. Adding a return value is the same in many cases - except that only call sites that need to be able to perform the `break` need to be modified. Other call sites can just continue to call the function and ignore the return value. – Michael Burr Feb 15 '16 at 08:44
  • @MichaelBurr In a different part of the code that I cannot modify yet have to run regardless is a (very badly made) check that any functions I write for this project can only return void. I would slap the guy that wrote this but he's no longer here. – Magisch Feb 15 '16 at 08:54
  • 2
    @Magisch I still don't understand why you can't modify that badly written part to return something else than void, but you _can_ modify it introducing the setjmp/longjmp fuzz. – Jabberwocky Feb 15 '16 at 08:59
  • @MichaelWalz I can modify an earlier part, but not the part where it checks the function against it. I do have the source code for the other parts, but im not allowed to modify it (dont ask me why, it has to do with that the other guy left a while ago). – Magisch Feb 15 '16 at 09:00
  • 1
    @Magisch but you still need to modify that part for `setjmp`. – Jabberwocky Feb 15 '16 at 09:03
  • That part, yes. But the function will be called (and checked against the fact that it has to be void) later in the code that I cannot modify. I will use the input value to the function to determine wether or not to jump, since I suppose that you can't make a int returning function pretend it returns void. – Magisch Feb 15 '16 at 09:09
  • 1
    @Magisch "I suppose that you can't make a int returning function pretend it returns void". Yes you can. You can call a function returning an `int` exactly as if it was a `void` function. In that case the return value will simply be ignored. – Jabberwocky Feb 15 '16 at 10:33
  • @MichaelWalz Even if the code calling it assumes on every call that it is indeed a void? Can you concieve of an example where a non-returning function returning something could cause a crash? – Magisch Feb 15 '16 at 10:43
  • 2
    @Magisch you are using this everyday each time you call [`printf`](http://www.cplusplus.com/reference/cstdio/printf/) which returns an `int` (total number of characters written) but which is ignored almost all the time. You will _never_ get a crash by ignoring the return value of a function. This behaviour is perfectly defined. – Jabberwocky Feb 15 '16 at 10:47
  • @MichaelWalz: On the source code level, yes, you can just ignore the returned value. But as far as I have understood it this concerns compiled code inside the library he is using? – Thomas Padron-McCarthy Feb 15 '16 at 10:48
  • @ThomasPadron-McCarthy Yes, a library which I have limited source access to and which I'm not allowed to modify. So I really have no idea but in the documentation it says specifically that the function of that name (the one im writing) has to return void. Im not sure what kind of weird tomfoolery happens in this, as it has a lot of interweaved assembler within (its a hardware driver lib for a specialised piece of hardware). – Magisch Feb 15 '16 at 10:50
  • 10
    @Magisch: beware that you might still be breaking the code despite the fact that you've worked around the explicit requirement for a void-return. If your function "must return void", then your former colleague's code might in particular have been written on the assumption that it "must return" at all. But you've now written a function that doesn't return, it jumps out instead. Check very carefully that your colleague didn't write clean-up code which you're now jumping past, because if so then skipping it likely will cause resource leaks, or leave some data structures in an inconsistent state. – Steve Jessop Feb 15 '16 at 10:53
  • 4
    @Magisch: Especially having read your latest comment, I would want to be a bit careful stomping around with longjumps in this program. But if it's the only way, and careful testing fails to find any problems, then perhaps you can cross your fingers and hope for the best. Just tell me which type of airplane this will be installed in, so I know which flights to avoid... – Thomas Padron-McCarthy Feb 15 '16 at 10:53
  • @ThomasPadron-McCarthy So far unit tests I've run using the longjump have worked properly. When I changed the function to return int and tested against that I got a segfault on the end of the first call to it. – Magisch Feb 15 '16 at 10:58
  • @Magisch: that's plausible, reasons I can think of that would happen include that the "assembler tomfoolery" makes a call direct to the C function from assembly (and therefore requires it to have a particular return type) or else that somewhere along the line a pointer to the function is reinterpreted as a pointer to a function of particular signature (and therefore a call to it from C via that function pointer requires it to have the right return type). – Steve Jessop Feb 15 '16 at 11:03
  • @SteveJessop I don't have the complete assembler source code and I don't understand enough of it to confirm your suspicions but its working properly so far with the longjump function. Memory leak also seems to be not happening. – Magisch Feb 15 '16 at 11:43
  • 23
    Might be worth adding a big "This is usually a very bad idea" warning on top of this answer. – Ixrec Feb 15 '16 at 13:20
  • 2
    This is a terrible, terrible idea -1 – Jack Aidley Feb 15 '16 at 14:14
  • @ThomasPadron-McCarthy If you care, its a specialised type of barcode scanner with integrated data processing on our end. – Magisch Feb 15 '16 at 14:23
  • @ThomasPadron-McCarthy: You should also add a note that performing `longjmp` without a prior `setjmp` is undefined behavior. So this is not something to aim for anyway when you are dealing with existing code structures and cannot know where else they might get called without your being aware of it. Are you with me? – dhein Feb 15 '16 at 14:37
  • 1
    @Ixrec: Well, I wouldn't go as far as to say that it's "usually a very bad idea". It can be, sometimes, but I've used longjmp with good results in some of my programs. (Granted, I didn't jump through other code that I didn't have control over.) setjmp/longjmp is part of C, with well-defined behaviour. Unless you do it wrong. Then you'd better have bullet-proofed your feet, as usual with C... – Thomas Padron-McCarthy Feb 15 '16 at 16:17
  • @ThomasPadron-McCarthy: Is the second part of your last comment addressed to me? If so, yeah it is part of it and there is well-defined behavior if used correctly. But note that OP is expressing he has to hand this structure to the library part. means he hasn't absolute controll of when is function maybe gets called. And therefor this answer fits in the cathegory "Unless you do it wrong." So after ignoring my last comment jsut to notify you: this is worth dowvoting to me, and I will do so untill you add this informational. – dhein Feb 16 '16 at 10:19
  • @Zaibis: Sorry for ignoring you. But I've added some warnings now. – Thomas Padron-McCarthy Feb 16 '16 at 12:00
  • @Zaibis I know its a dirty fix that has potential pitfalls and could blow up on me, but thats what im looking for currently. – Magisch Feb 16 '16 at 14:20
9

(Note: the question has been edited since I originally wrote this)

Because of the way C is compiled it must know where to break to when the function is called. Since you can call it from anywhere, or even somewhere a break makes no sense, you cannot have a break; statement in your function and have it work like this.

Other answers have suggested terrible solutions such as setting a global variable, using a #define or longjumping(!) out of the function. These are extremely poor solutions. Instead, you should use the solution you wrongly dismiss in your opening paragraph and return a value from your function that indicates the state that you want to trigger a break in this case and do something like this:

#include <stdbool.h>
#include <stdio.h>

bool checkAndDisplay(int n)
{
    printf("%d\n", n);
    return (n == 14);
}

int main(void) {
    for (int i = 0; i <= 100; i++) {
        if (checkAndDisplay(i))
            break;
    }
    return 0;
}

Trying to find obscure ways to achieve things like this instead of using the correct way to achieve the same end result is a surefire way to generate dire quality code that is a nightmare to maintain and debug.

You mention, hidden in a comment, that you must use a void return. This is not a problem: just pass the break flag in as a pointer:

#include <stdbool.h>
#include <stdio.h>

void checkAndDisplay(int n, bool* wantBreak)
{
    printf("%d\n", n);
    if (n == 14)
        *wantBreak = true;   /* write through the pointer, not to the pointer itself */
}

int main(void) {
    bool wantBreak = false;
    for (int i = 0; i <= 100; i++) {
        checkAndDisplay(i, &wantBreak);
        if (wantBreak)
            break;
    }
    return 0;
}

Since your parameters have fixed types, I suggest you use a cast to pass the pointer in through one of the parameters, e.g. foo(a, b, (long)&out); this relies on long being wide enough to hold a pointer on your platform.
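
A minimal sketch of that cast, assuming the mandated signature from the question and a platform where long is at least as wide as a pointer (true on typical LP64 systems, not on 64-bit Windows). The function loop_until_break and the condition passV == 14 are illustrative assumptions, and the sketch presumes you control the call site, which may not hold if the library calls foo itself.

```c
#include <stdbool.h>
#include <stdio.h>

/* Assumption: long can hold an object pointer on the target platform. */
_Static_assert(sizeof(long) >= sizeof(bool *), "long cannot hold a pointer here");

/* Signature taken from the question; bVal carries a bool* in disguise. */
void foo(int passV, int aVal, long bVal)
{
    bool *wantBreak = (bool *)bVal;   /* recover the pointer passed in bVal */

    (void)aVal;
    printf("a: %d\n", passV);
    if (passV == 14)                  /* assumed break condition */
        *wantBreak = true;
}

/* Returns the value of i at which the loop stopped. */
int loop_until_break(void)
{
    bool wantBreak = false;
    int i;

    for (i = 0; i <= 100; i++) {
        foo(i, 0, (long)&wantBreak);
        if (wantBreak)
            break;
    }
    return i;
}
```

The pointer-to-integer conversion is implementation-defined, so the static assertion is there to fail the build rather than silently truncate the address.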

Jack Aidley
  • Hardware and software constraints force me to not use a return value. Its clearly listed as a constraint in my question, so why do you suggest I ignore a software constraint? That wont work for me. – Magisch Feb 15 '16 at 14:12
  • In specific my project uses a inhouse driver/library on specific hardware that has a special constraint stating the function I write _has_ to return void, if you read the comment discussion in the accepted answer you'll get to know exactly why. – Magisch Feb 15 '16 at 14:13
  • 1
    It's helpful if you include the constraint in your question rather than hiding it in a comment. I have added a section about having a void return. – Jack Aidley Feb 15 '16 at 14:15
  • 2
    I specifically discuss that I want an alternative to using return values in my question, is that not enough? If you want to get hyper-specific to my use case (not helpful for future visitors imo but ok) the function is mandated by the library to have the format `void foo (int passV, int aVal, long bVal)` The doc for the library specifically mentions this and not using it makes the program crash with a segfault. I've tried it. – Magisch Feb 15 '16 at 14:16
  • 1
    No, because you don't state it is a requirement; it just sounds like you're ignorant. – Jack Aidley Feb 15 '16 at 14:17
  • Alright, I included the specific constraint in the question. – Magisch Feb 15 '16 at 14:20
  • Do you require all three arguments for parameters you're passing in? If not, I would use a cast'd pointer as one of the arguments so you can use it as a return value, e.g. `foo(a, b,(long)&out)`. – Jack Aidley Feb 15 '16 at 14:31
  • 1
    The data types are mandated. Are you suggesting I misuse one of my parameters to pass a pointer instead? I may be able to do that if I work two of my parameters into one using a bit of bitshifting and then untangle them from within the function. – Magisch Feb 15 '16 at 14:33
  • Yes, that's exactly what I'm suggesting - sorry, I made a typo in my comment (now corrected). – Jack Aidley Feb 15 '16 at 14:36
  • 2
    I still think a static global variable will be much easier than tangling pointers... – Falco Feb 15 '16 at 14:40
  • @Falco Im not sure I can do that without alot of additional overhead (declaring it across multiple files) and in the program itself I already have about 40 global variables. I know its a mess. – Magisch Feb 15 '16 at 15:03
  • I don't see the problem with multiple files? If you declare the variable in the same file as the function, it should be accessible in the same way as your function is accessible... And the only place where you need to access it is right after the function call `if (function_x_result) break;` – Falco Feb 15 '16 at 15:15
  • While I'll accept that occasional global variables are required, using them for this kind of function specific avoidance strikes me as a bad practice. You can swizzle/unswizzle the arguments in a wrapping function and so, for the caller, it could be completely transparent. – Jack Aidley Feb 15 '16 at 16:00
  • You seriously think mangling a pointer to `long` is a better idea than using a standard, *defined* part of the language like `setjmp`? – Alex Celeste Feb 15 '16 at 23:34
  • @Magisch You wouldn't need to swizzle parameters, surely? Just make it a pointer to a `struct`. (You'd still need the platform-specific pointer-to-long conversion) – user253751 Feb 16 '16 at 09:41
  • 1
    @Leushenko: It's not undefined behaviour, providing you ensure the pointer type is large enough to hold the resulting value (C 2011 standard, 6.3.2.3). Since Magisch is writing code for a specified platform this is not a concern. – Jack Aidley Feb 16 '16 at 10:06
8

break is a statement that is resolved at compile time. Therefore the compiler must find an appropriate for/while/do loop or switch within the same function. Note that there is no guarantee that the function couldn't be called from somewhere else.

Zbynek Vyskovsky - kvr000
  • Is it possible to emulate this functionality somehow? (Cancelling the for loop from within another function)? – Magisch Feb 15 '16 at 08:10
  • @Magisch : You can for example return bool/int to indicate that you want to break and then invoke break from the for loop according to return value. – Zbynek Vyskovsky - kvr000 Feb 15 '16 at 08:11
  • Im afraid I can't use a return value for this. Is there a different method? – Magisch Feb 15 '16 at 08:13
  • @Magisch : Well, longjmp is an option but requires special handling and is really expensive. In C++ you can use exceptions but they're quite expensive as well. So rather think about how to reorganize the code. Any solution will require special handling for both the caller and callee so return value (either real return or via pointer) is still best option. – Zbynek Vyskovsky - kvr000 Feb 15 '16 at 08:17
6

If you cannot use the break instruction, you could define a variable local to your module (a static file-scope variable) and add a second run condition to the for loop, as in the following code:

#include <stdio.h>
#include <stdbool.h>

static bool continueLoop = true;

void foo (int a)
{
    bool doBreak = true;

    printf("a: %d",a);

    if(doBreak == true){
        continueLoop = false;
    }
    else {
        continueLoop = true;
    }
}
int main(void) {
    continueLoop = true;   // Has to be true before entering the loop
    for (int i = 0; (i <= 100) && continueLoop; i++)
    {
        foo(i);
    }
    return 0;
}

Note that in this example this is not exactly a break instruction, but the for loop will not do another iteration. If you want a real break, you have to insert an if statement that tests the variable continueLoop and leads to break:

int main(void) {
    continueLoop = true;   // Has to be true before entering the loop
    for (int i = 0; i <= 100; i++)
    {
        foo(i);
        if(!continueLoop){
            break;
        }
    }
    return 0;
}
Frodo
  • 2
    Of course this is the correct answer. Asking "how do you terminate a for loop?" `for (;CONDITION;)` is like asking "how do you terminate a while loop?" `while(CONDITION)`. Note the large bold CONDITION :) – Fattie Feb 15 '16 at 23:53
4

I believe it's related to how a break statement is translated into machine code. The break statement will be translated as an unconditional branch to the label immediately following the loop or switch.

mov ECX,5
label1:
  jmp <to next instruction address>  ;break
loop label1
<next instruction>

While the call to foo() from inside the loop will result in something like

mov ECX,5
label1:
  call <foo address>
loop label1
<next instruction>

and at foo address

call <printf address>
jmp <to where?> ;break cannot translate to any address.
wizzwizz4
Zamrony P. Juhara
4

This is another idea that may or may not be feasible: keep a variable around that can turn foo into a no-op:

int broken = 0;

void foo (int a) {
    if (broken) return;

    printf("a: %d", a);
    broken = 1; // "break"
}

int main(void) {
    for (int i = 0; i <= 100; i++) {
        foo(i);
    }
    return 0;
}

This is functionally the same except for some loss of clock cycles (the function will be called, but perform only the if statement), and there is no need to change the loop. It's not threadsafe and only works the first time (but foo could reset the broken variable to 0 if called with a equal to 0, if needed).

So not great, but an idea that wasn't mentioned yet.

RemcoGerlich
  • 30,470
  • 6
  • 61
  • 79
  • *if* appropriate you can just put it in the condition of the for. `for(blah; i<101 && whatever(); ++i)` – Fattie Feb 15 '16 at 23:55
  • I assumed that the for loop was in code not under his control, that was the only way his problem made sense to me. – RemcoGerlich Feb 16 '16 at 07:41
4

In a case like this, consider using a while() loop with several conditions chained with && instead of a for loop. Although you can alter the normal control flow using functions like setjmp and longjmp, it's pretty much considered bad practice everywhere. You shouldn't have to search too hard on this site to find out why. (In short, it's because of its capacity to create convoluted control flow that doesn't lend itself to either debugging or human comprehension.)
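
One way the while() variant might look, as a sketch only: the flag keep_going, the function run_while and the stop condition a == 14 are assumptions made for this example, and foo is simplified to one parameter as in the question's snippet.

```c
#include <stdbool.h>
#include <stdio.h>

static bool keep_going = true;   /* hypothetical flag, cleared by foo() */

void foo(int a)
{
    printf("a: %d\n", a);
    if (a == 14)                 /* assumed stop condition */
        keep_going = false;
}

/* Returns the value of i after the loop has ended. */
int run_while(void)
{
    int i = 0;

    keep_going = true;           /* reset before entering the loop */
    while (i <= 100 && keep_going) {
        foo(i);
        i++;
    }
    return i;
}
```

Unlike a break, the increment still runs after the last call to foo(), so i ends up one past the value that cleared the flag.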

Also consider doing something like this:

#include <stdio.h>

#define someCondition 14   /* placeholder: substitute your real stop condition */

int foo (int a) {
    if(a == someCondition) return 0;
    else {
        printf("a: %d", a);
        return 1;
    }
}

int main(void) {
    for (int i = 0; i <= 100; i++) {
        if(!foo(i)) break;
    }
    return 0;
}

In this case, the loop continues only while 'foo' returns a true (non-zero) value; as soon as the condition inside 'foo' is met, 'foo' returns 0 and the loop breaks.

Edit: I'm not explicitly against the use of goto, setjmp, longjmp etc. But I think in this case there is a much simpler and more concise solution available without resorting to these measures!

ajxs
4

If you cannot handle return values, can you at least add a parameter to the function? I can imagine a solution like this:

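void foo(int x, int *a);   /* prototype, so main can call foo before its definition */

int main (void)
{
  int a = 0;
  int x = 0;   /* placeholder input, not defined in the original sketch */

  for (; 1 != a;)
  {
    foo(x, &a);
  }
  return 0;
}

void foo( int x, int * a)
{
  int succeeded = (x == 0);   /* placeholder for the real success test */

  if (succeeded)
  {
    /* set the break condition*/
    *a = 1;
  }
  else
  {
    *a = 0;
  }
}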

It's my first post, so, please forgive me, if my formatting is messed up :)

3

Following your updated question clearly setting out the limitations, I would suggest you move the entire loop inside your function and then call a second function with a return value inside that function, e.g.

#include <stdbool.h>

bool foo (int x)
{
    return (x==14);
}

void loopFoo(int passV, int aVal, long bVal)
{
   for (int i = 0; i <= 100; ++i)
   {
       if (foo(i))
           break;
   }
}

This avoids any extreme and fragile gymnastics to get around the limitation.

Jack Aidley
3

Just set a global variable and check it in the loop:

#include <stdio.h>

int leave = 0;

void foo (int a) {
    printf("a: %d", a);
    leave = 1;
}

int main(void) {
    for (int i = 0; i <= 100; i++) {
        foo(i);
        if (leave)
          break;
    }
    return 0;
}
csd
  • what possible reason would you have to make the variable global? `for(blah; I<101 & !leave; ++i)` – Fattie Feb 15 '16 at 23:56
  • @joe blow because the question states that the inner function declaration can't be changed to accommodate a return value. where is 'leave' declared in your one-liner? – csd Feb 18 '16 at 18:40
  • sorry, I meant as a function, like `i<101&&!leave()` Anyway look - you're quite right, the question is odd – Fattie Feb 19 '16 at 04:08
2

Consider inlining your function manually in the for loop. If this function is called in multiple loops, define it as a macro:

#define f()\
printf("a: %d", a);\
break;
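A sketch of how such a macro might be used, assuming a conditional break (the macro name FOO, the function count_until_break and the condition a == 3 are invented for the example). Because the macro expands in place, its break belongs to the loop that contains it:

```c
#include <stdio.h>

/* Expands to two statements, so it must be used inside a braced loop body.
   It cannot be wrapped in do { } while (0), because the break would then
   leave that inner do-while instead of the caller's loop. */
#define FOO(a)              \
    printf("a: %d\n", (a)); \
    if ((a) == 3)           \
        break

/* Runs the loop and returns how many times the macro body was executed. */
int count_until_break(void) {
    int calls = 0;
    for (int i = 0; i <= 100; i++) {
        calls++;
        FOO(i);   /* the break inside the macro exits this for loop */
    }
    return calls;
}
```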
Dmitry Grigoryev
2

You can throw an error in your function inside the loop and catch that error outside the loop. Note that this requires compiling the code as C++, since C has no exceptions.

#include <stdio.h>

void foo (int a) {
    printf("a: %d", a);
    if (a == 50)
    {
       throw a;
    }
}

int main(void) {
    try {
        for (int i = 0; i <= 100; i++) {
            foo(i);
        }
    }
    catch(int e) {
    }
    return 0;
}
Russell Hankins
2

This question has already been answered, but I think it is worth delving into all of the possible options to exit a loop in C++. There are basically five possibilities:

  • Using a loop condition
  • Using a break condition
  • Using a return condition
  • Using exceptions
  • Using goto

In the following, I will describe use cases for these options using C++14. However, you can do all of these in earlier versions of C++ (except maybe exceptions). To keep it short, I will omit the includes and the main function. Please comment if you think some part needs more clarity.

1. Using a loop condition

The standard way to exit a loop is a loop condition. The loop condition is written in the middle part of a for statement, or between the parentheses of a while statement:

for(something; LOOP CONDITION; something) {
    ... 
}
while (LOOP CONDITION) {
    ... 
}
do {
    ... 
} while (LOOP CONDITION);

The loop condition decides if the loop should be entered, and if the loop should be repeated. In all of the above cases, the condition has to be true, for the loop to be repeated.
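One detail worth spelling out: for and while test the condition before the first pass, while do ... while tests it only after the body has run once. A small sketch in C (the function names are made up for illustration), where each function counts how often its body runs for the condition n > 0:

```c
/* for: the condition is checked before every pass, including the first. */
int count_for(int n) {
    int runs = 0;
    for (; n > 0; --n)
        ++runs;
    return runs;
}

/* while: same behaviour as for, the body may run zero times. */
int count_while(int n) {
    int runs = 0;
    while (n > 0) {
        ++runs;
        --n;
    }
    return runs;
}

/* do-while: the body runs once before the condition is checked. */
int count_do_while(int n) {
    int runs = 0;
    do {
        ++runs;
        --n;
    } while (n > 0);
    return runs;
}
```

With n == 0 the first two return 0 and the last one returns 1; with n == 3 all three return 3.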

As an example, if we want to output the numbers from 0 to 2, we could write the code using a loop and a loop condition:

for (auto i = 0; i <= 2; ++i)
    std::cout << i << '\n';
std::cout << "done";

Here the condition is i <= 2. As long as this condition evaluates to true, the loop keeps running.

An alternative implementation would be to put the condition into a variable instead:

auto condition = false;

for (auto i = 0; !condition; ++i) {
    std::cout << i << '\n';
    condition = i >= 2; //stop after 2 has been printed
}
std::cout << "done";

Checking the output for both versions, we get the desired result:

0
1
2
done

How would you use a loop condition in a real-world application?

Both versions are widely used in C++ projects. The first version is more compact and therefore easier to understand, but the second version is usually used if the condition is more complex or needs several steps to be evaluated.

For example:

auto condition = false;
for (auto i = 0; !condition; ++i)
    if (is_prime(i))
        if (is_large_enough(i)) {
            key = calculate_cryptographic_key(i, data);
            if (is_good_cryptographic_key(key))
                condition = true;
        }

2. Using a break condition

Another simple way to exit a loop is to use the break keyword. If it is used inside a loop, the loop stops and execution continues with the statement after the loop:

for (auto i = 0; true; ++i) {
    if (i == 3)
        break;
    std::cout << i << '\n';
}
std::cout << "done";

This will output the current number and increment it by one until i reaches a value of 3. Here the if statement is our break condition. If the condition is true, the loop is broken and execution continues with the next line after the loop, printing done.

Doing the test, we indeed get the expected result:

0
1
2
done

It is important to note that break only stops the innermost loop in the code. Therefore, if you use multiple nested loops, it can lead to undesired behaviour:

for (auto j = 0; true; ++j)
    for (auto i = 0; true; ++i) {
        if (i == 3)
            break;
        std::cout << i << '\n';
    }
std::cout << "done";

With this code we wanted to get the same result as in the example above, but instead we get an infinite loop, because the break only stops the loop over i, and not the one over j!

Doing the test:

0
1
2
0
1
2
...
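One way to make the nested version stop is to let the inner loop set a flag that the outer loop condition checks. A sketch in C, assuming a flag named done and a wrapper function print_until_three (both invented for the example):

```c
#include <stdbool.h>
#include <stdio.h>

/* Prints 0, 1, 2 and "done"; returns how many numbers were printed. */
int print_until_three(void) {
    int printed = 0;
    bool done = false;   /* set by the inner loop, checked by the outer one */

    for (int j = 0; !done; ++j)
        for (int i = 0; true; ++i) {
            if (i == 3) {
                done = true;   /* tell the outer loop to stop as well */
                break;         /* leaves only the inner loop */
            }
            printf("%d\n", i);
            printed++;
        }

    printf("done\n");
    return printed;
}
```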

How would you use a break condition in a real-world application?

Usually break is only used to skip parts of an inner loop, or to add an additional loop exit.

For example, in a function testing for prime numbers, you would use it to skip the rest of the execution, as soon as you found a case where the current number is not prime:

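auto is_prime = p > 1; //numbers below 2 are not prime
for (auto i = 2; i < p; ++i) { //start at 2: p%0 is undefined and every number is divisible by 1
    if (p%i == 0) { //p is divisible by i!
        is_prime = false;
        break; //we already know that p is not prime, therefore we do not need to test more cases!
    }
}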

Or, if you are searching a vector of strings, you usually put the maximum size of the data in the loop head, and use an additional condition to exit the loop if you actually found the data you are searching for.

auto j = size_t(0);
for (auto i = size_t(0); i < data.size(); ++i)
    if (data[i] == "Hello") { //we found "Hello"!
        j = i;
        break; //we already found the string, no need to search any further!
    }

3. Using a return condition

The return keyword exits the current function and returns control to the calling function. Thus it can be used to exit loops and, in addition, to give a value back to the caller. A common case is to use return to exit a loop (and its function) and return a result.

For example, we can rewrite the is_prime function from above:

auto inline is_prime(int p) {
    if (p < 2)
        return false; //0, 1 and negative numbers are not prime
    for (auto i = 2; i < p; ++i) //start at 2 to avoid dividing by zero
        if (p%i == 0) //p is divisible by i!
            return false; //we already know that p is not prime, and can skip the rest of the cases and return the result
    return true; //we didn't find any divisor before, thus p must be prime!
}

The return keyword can also be used to exit multiple loops:

auto inline data_has_match(std::vector<std::string> a, std::vector<std::string> b) {
    for (auto i = size_t(0); i < a.size(); ++i)
        for (auto j = size_t(0); j < b.size(); ++j)
            if (a[i] == b[j])
                return true; //we found a match! nothing to do here
    return false; //no match was found
}

How would you use a return condition in a real-world application?

Inside smaller functions, return is often used to exit loops and directly return results. Furthermore, inside larger functions, return helps to keep the code clear and readable:

for (auto i = 0; i < data.size(); ++i) {
    //do some calculations on the data using only i and put them inside result
    if (is_match(result,test))
        return result;
    for (auto j = 0; j < i; ++j) {
        //do some calculations on the data using i and j and put them inside result
        if (is_match(result,test))
            return result;
    }
}
return 0; //we need to return something in the case that no match was found

This is much easier to understand than:

auto break_i_loop = false;
auto return_value = 0;
for (auto i = 0; !break_i_loop; ++i) {
    //do some calculations on the data using only i and put them inside result
    if (is_match(result,test)) { //a match was found, save the result and break the loop!
        return_value = result;
        break;
    }
    for (auto j = 0; j < i; ++j) {
        //do some calculations on the data using i and j and put them inside result
        if (is_match(result,test)) { //a match was found, save the result, break the loop, and make sure that we break the outer loop too!
            return_value = result;
            break_i_loop = true;
            break;
        }
    }
    if (!break_i_loop) //if we didn't find a match, but this was the last element of the data, we need to break the outer loop
        break_i_loop = i + 1 >= data.size();
}
return return_value; //return the result

4. Using exceptions

Exceptions are a way to mark exceptional events in your code, for example when you want to read data from a file but the file doesn't exist. Exceptions can be used to exit loops; however, the compiler usually generates a lot of boilerplate code to safely continue the program if the exception is handled. Therefore exceptions shouldn't be used to return values, because doing so is very inefficient.

How would you use an exception in a real-world application?

Exceptions are used to handle truly exceptional cases. For example, if we want to calculate the inverse of our data, it might happen that we try to divide by zero. However this is not helpful in our calculation, therefore we write:

auto inline inverse_data(std::vector<int>& data) {
    for (auto i = size_t(0); i < data.size(); ++i)
        if (data[i] == 0)
            throw std::string("Division by zero on element ") + std::to_string(i) + "!";
        else
            data[i] = 1 / data[i];
    return data; //give the result back, so that the caller can store it
}

We can then handle this exception inside the calling function:

while (true)
    try {
        auto data = get_user_input();
        inverse = inverse_data(data);
        break;
    }
    catch (...) {
        std::cout << "Please do not put zeros into the data!";
    }

If data contains zero, then inverse_data will throw an exception, the break is never executed, and the user has to input data again.

There are even more advanced options for this kind of error handling, such as additional error types, but this is a topic for another day.

What you should never do!

As mentioned before, exceptions can produce a significant runtime overhead. Therefore, they should only be used in truly exceptional cases. Although it is possible to write the following function, please don't!

auto inline next_prime(int start) {
    auto p = start;
    try {
        for (auto i = start; true; ++i)
            if (is_prime(i)) {
                p = i;
                throw p; //a bare "throw;" without an active exception would call std::terminate
            }
   }
   catch (...) {}
   return p;
 }

5. Using goto

The goto keyword is hated by most programmers, because it makes code harder to read, and it can have unintended side effects. However, it can be used to exit (multiple) loops:

for (auto j = 0; true; ++j)
    for (auto i = 0; true; ++i) {
        if (i == 3)
            goto endloop;
        std::cout << i << '\n';
    }
endloop:
std::cout << "done";

Unlike the loop in part 2, this loop will end, and output:

0
1
2
done

How would you use a goto in a real-world application?

In 99.9% of the cases there is no need to use the goto keyword. The only exceptions are embedded systems, like an Arduino, or very high performance code. If you are working with one of these two, you might want to use goto to produce faster or more efficient code. However, for the everyday programmer, the downsides are much bigger than the gains of using goto.

Even if you think your case is one of the 0.1%, you need to check if the goto actually improves your execution. More often than not, using a break or return condition is faster, because the compiler has a harder time understanding code containing goto.

jan.sende