74

I know everyone hates gotos. In my code, for reasons I have considered and am comfortable with, they provide an effective solution (i.e. I'm not looking for "don't do that" as an answer; I understand your reservations, and I understand why I am using them anyway).

So far they have been fantastic, but I want to expand the functionality in a way that requires me to store pointers to the labels and then go to them later.

If this code worked, it would represent the type of functionality that I need. But it doesn't work, and 30 minutes of googling hasn't revealed anything. Does anyone have any ideas?

int main (void)
{
  int i=1;
  void* the_label_pointer;

  the_label:

  the_label_pointer = &the_label;

  if( i-- )
    goto *the_label_pointer;

  return 0;
}
Ciro Santilli OurBigBook.com
Joshua Cheek
  • Can you explain why you need to store the labels in pointers? – Ahmed Nov 22 '09 at 06:23
  • 6
    I am implementing a finite state machine, based off of the answer by Remo.D in this post http://stackoverflow.com/questions/132241/ My version has evolved to be considerably more involved than this, but this represents the basic structure. It has been effective so far, but I would like to make available to the states some context where they can access the calling state and current state through either some variables that are set on state transitions, or through a callback or something. – Joshua Cheek Nov 22 '09 at 06:36
  • 1
    Duplicate of http://stackoverflow.com/questions/938518/c-c-goto – qrdl Nov 22 '09 at 07:52

14 Answers

76

The C and C++ standards do not support this feature.

However, the GNU Compiler Collection (GCC) includes a non-standard extension for doing this, as described in the Labels as Values section of the Using the GNU Compiler Collection manual.

Essentially, they have added a special unary operator && that reports the address of the label as type void *. See the article for details. With that extension, you could just use && instead of & in your example, and it would work on GCC.
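As a rough illustration of the && extension in use, here is a minimal sketch of a dispatch table for a toy interpreter. The opcodes and the run_program function are made up for this example and do not come from any real project. When the compiler does not define __GNUC__, the sketch falls back to a standard loop around a switch. Note that the GNU path does no bounds checking on the opcode.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine (illustration only). */
enum { OP_INC, OP_DOUBLE, OP_HALT, OP_COUNT };

/* Runs `code` until OP_HALT and returns the accumulator. */
int run_program(const unsigned char *code)
{
    int acc = 0;
    size_t pc = 0;

#if defined(__GNUC__)
    /* GNU C extension: &&label yields a void * that `goto *` can jump to. */
    static void *const dispatch[OP_COUNT] = { &&do_inc, &&do_double, &&do_halt };
#define NEXT() goto *dispatch[code[pc++]]
    NEXT();
do_inc:
    acc += 1;
    NEXT();
do_double:
    acc *= 2;
    NEXT();
do_halt:
    return acc;
#undef NEXT
#else
    /* Portable fallback: a loop around a switch. */
    for (;;) {
        switch (code[pc++]) {
        case OP_INC:    acc += 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;  /* unknown opcode */
        }
    }
#endif
}
```

Calling run_program on the byte sequence OP_INC, OP_INC, OP_DOUBLE, OP_INC, OP_HALT should walk the table once per opcode and return the final accumulator value.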

P.S. I know you don’t want me to say it, but I’ll say it anyway… DON’T DO THAT!!!

ib.
Michael Aaron Safyan
  • 27
    goto label address is great for writing an interpreter. – Justin Dennahower Nov 05 '13 at 17:22
  • 4
    I'd like to know why in the world they used double ampersands (logical and), when the existing get-the-address-of-an-identifier '&' would have made the most sense. The only reason why I can figure is that label identifiers appear to exist in a parallel but separate scope as variable identifiers, and thus there could be ambiguity between getting the address of a label vs variable if both were named the same (arguably though that's just bad practice to declare an int foo and foo: in the same function). If this ever gets into the standard, I'd hope for '&', not '&&'. – Dwayne Robinson Nov 02 '14 at 12:25
  • 2
    Totally do it. If you are writing an interpreter loop that's the way to do it. – Kariddi Sep 28 '17 at 17:42
  • @JustinDennahower: CPython has been using it for this purpose for years (and it nets a 15-20% speedup by replacing a single unpredictable branch on a `switch` with more predictable per-opcode branches; each opcode has its own pattern of what opcodes follow it, so per-opcode branches mean per-opcode branch prediction, which is much more reliable). It's all hidden behind macros, so when it's compiled on non-GCC compilers, it still works (via a standard infinite loop + `switch` construct), it's just slower. – ShadowRanger Sep 08 '22 at 19:46
23

I know the feeling when everybody says it shouldn't be done; it just has to be done. In GNU C, use &&the_label; to take the address of a label. (https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html) The syntax you guessed, goto *ptr on a void*, is actually what GNU C uses.

Or, if you want to use inline assembly for some reason, here's how to do it with GNU C asm goto:

// unsafe: this needs to use  asm goto so the compiler knows
// execution might not come out the other side
#define unsafe_jumpto(a) asm("jmp *%0"::"r"(a):)

// target pointer, possible targets
#define jumpto(a, ...) asm goto("jmp *%0" : : "r"(a) : : __VA_ARGS__)

int main (void)
{
  int i=1;
  void* the_label_pointer;

  the_label:
  the_label_pointer = &&the_label;

label2:

  if( i-- )
    jumpto(the_label_pointer, the_label, label2, label3);

label3:
  return 0;
}

The list of labels must include every possible value for the_label_pointer.

The macro expansion will be something like

asm goto("jmp *%0" : : "r"(the_label_pointer) : : the_label, label2, label3);

This compiles with GCC 4.5 and later, and with recent clang, which gained asm goto support some time after clang 8.0: https://godbolt.org/z/BzhckE. The resulting asm looks like this for GCC 9.1, which optimized away the "loop" of i=1 / i-- and just put the_label after the jumpto. So it still runs exactly once, like in the C source.

# gcc9.1 -O3 -fpie
main:
    leaq    .L2(%rip), %rax     # ptr = &&label
    jmp *%rax                     # from inline asm
.L2:
    xorl    %eax, %eax          # return 0
    ret

But clang didn't do that optimization and still has the loop:

# clang -O3 -fpie
main:
    movl    $1, %eax
    leaq    .Ltmp1(%rip), %rcx
.Ltmp1:                                 # Block address taken
    subl    $1, %eax
    jb      .LBB0_4                  # jump over the JMP if i was < 1 (unsigned) before SUB.  i.e. skip the backwards jump if i wrapped
    jmpq    *%rcx                   # from inline asm
.LBB0_4:
    xorl    %eax, %eax              # return 0
    retq

The label address operator && is a GNU C extension, so it only works with GCC and compatible compilers such as clang. And obviously the jumpto assembly macro needs to be implemented specifically for each processor (this one works with both 32-bit and 64-bit x86).

Also keep in mind that (without asm goto) there would be no guarantee that the state of the stack is the same at two different points in the same function. And at least with some optimization turned on, it's possible that the compiler assumes some registers contain particular values at the point after the label. These kinds of things can easily get screwed up when doing crazy shit the compiler doesn't expect. Be sure to proofread the compiled code.

This is why asm goto is necessary: it makes the jump safe by letting the compiler know where you will or might jump, so that code generation is consistent for the jump and the destination.

Peter Cordes
Fabel
  • 1
    Can't you just `lea eax, label; mov label_ptr, eax` (intel syntax), to store the pointer in a variable? – Calmarius Oct 16 '14 at 08:45
  • 1
    There is no doubt it can be implemented in assembly (which maybe could be considered better in this case). One benefit of implementing it in C is that the compiler can do some optimizations. – Fabel Oct 17 '14 at 16:51
  • One of the best answers here, thanks very much, helped me out in a reverse engineering project. – kungfooman Mar 30 '16 at 05:03
  • I think this is actually a very useful feature for border cases where you can't do infinite recursion because it would blow your stackframe and you need to track context without branching everytime before you jump. Sad that its only implemented in gcc :( – glades Jun 15 '22 at 16:23
  • @glades The same thing can be achieved with a switch statement, since the labels need to belong to a predefined set anyway. If you place all functions in one switch you can both call and goto any label in perfectly portable C. Yes, case labels can go anywhere in the code, even inside if blocks or whatever. (This is true for the answer rewritten by Peter Cordes; my original answer allowed jumping between code in different object files in a less limited and less secure way.) – Fabel Jun 16 '22 at 18:16
  • @Fabel I'm considering that but how would you do it if your code jumps into a label from multiple places and then has to return to the section it jumped from? It can't be a function for some reason, how would you do that with a switch statement? – glades Jun 16 '22 at 22:05
  • @glades Let the function have two arguments: the variable used in the switch statement and a pointer to a struct containing the actual arguments for the "function" (which is just one of the cases). This way the single function can be called recursively just like if it's a different function. If the "functions" need to return different kinds of values the struct can be used for that too. Of course each "function" can use a different struct (or a union if preferred). It's perfectly safe since the caller and the "function" agree on it. – Fabel Jun 24 '22 at 04:50
  • @Fabel: That would be a possibility if I could use functions. The problem is that within switch statements I need to recursively call another code section that might itself call this code section again. As nobody knows how many times this will happen I run the risk of a stack overflow. – glades Jun 27 '22 at 06:45
  • @glades You can both jump to another case label in the switch statement (as a common state machine), but then not return to the previous state if you haven't saved it in some way. Or you can call the single function recursively and be able to return (as with a normal function) but risk a stack overflow. Those methods can be mixed safely. And you can store a previous state in any way you like (like with a pointer to a label). I fail to see any limitations, except for the finite set of states/functions/labels (you can not add additional "states" in another object file and jump between). – Fabel Jun 27 '22 at 17:47
15

In a very, very, very old version of the C language (think of the time dinosaurs roamed the Earth), known as the "C Reference Manual" version (which refers to a document written by Dennis Ritchie), labels formally had type "array of int" (strange, but true), meaning that you could declare an int * variable

int *target;

and assign the address of label to that variable

target = label; /* where `label` is some label */

Later you could use that variable as the operand of goto statement

goto target; /* jumps to label `label` */

However, in ANSI C this feature was thrown out. In standard modern C you cannot take the address of a label and you cannot do a "parametrized" goto. This behavior is supposed to be simulated with switch statements, pointers to functions and other methods. Actually, even the "C Reference Manual" itself said that "Label variables are a bad idea in general; the switch statement makes them almost always unnecessary" (see "14.4 Labels").

AnT stands with Russia
  • Oh interesting, so the GNU C [labels as values](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html) extension is just reintroducing a historical C feature, with somewhat different syntax (`void *target = &&label` and `goto *target`). – Peter Cordes Apr 20 '22 at 03:09
14

You can do something similar with setjmp/longjmp.

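#include <setjmp.h>

int main (void)
{
    jmp_buf buf;
    // volatile: a non-volatile local that is modified after setjmp
    // has an indeterminate value after longjmp
    volatile int i=1;

    // this acts sort of like a dynamic label
    setjmp(buf);

    if( i-- )
        // and this effectively does a goto to the dynamic label
        longjmp(buf, 1);

    return 0;
}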
R Samuel Klatchko
  • 14
    Just a caution that setjmp/longjmp can be slow, since they save and restore much more than just the program counter. – RickNZ Nov 22 '09 at 07:00
  • 2
    This does not work: depending on whether `i` is stored in a register or *on the stack*, its original value (`1`) will be restored by `longjmp()` or not, hence potentially causing an infinite loop. – chqrlie Aug 09 '19 at 06:29
13

According to the C99 standard, § 6.8.6, the syntax for a goto is:

    goto identifier ;

So, even if you could take the address of a label, you couldn't use it with goto.

You could combine a goto with a switch, which is like a computed goto, for a similar effect:

#include <stdio.h>
#include <stdlib.h>

int foo(void) {
    static int i=0;
    return i++;
}

int main(void) {
    enum {
        skip=-1,
        run,
        jump,
        scamper
    } label = skip; 

#define STATE(lbl) case lbl: puts(#lbl); break
    computeGoto:
    switch (label) {
    case skip: break;
        STATE(run);
        STATE(jump);
        STATE(scamper);
    default:
        printf("Unknown state: %d\n", label);
        exit(0);
    }
#undef STATE
    label = foo();
    goto computeGoto;
}

If you use this for anything other than an obfuscated C contest, I will hunt you down and hurt you.

outis
  • What is the difference between puts(#lbl) and puts(lbl)? – Ahmed Nov 22 '09 at 07:31
  • 1
    The `#` is the preprocessor stringizing operator (http://en.wikipedia.org/wiki/C_preprocessor#Quoting_macro_arguments). It converts identifiers into strings. `puts(lbl)` won't compile because `lbl` isn't a `char *`. – outis Nov 22 '09 at 07:54
  • Rather, it will compile with warnings and crash if you run it. – outis Nov 23 '09 at 01:33
  • 4
    +1 for evil thinking and use of macros above and beyond the call of duty. – EvilTeach Apr 02 '10 at 22:20
10

The switch ... case statement is essentially a computed goto. A good example of how it works is the bizarre hack known as Duff's Device:

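/* Duff's original: `to` is a memory-mapped output register, so it is
   not incremented. The code assumes count > 0. */
send(to, from, count)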
register short *to, *from;
register count;
{
    register n=(count+7)/8;
    switch(count%8){
    case 0: do{ *to = *from++;
    case 7:     *to = *from++;
    case 6:     *to = *from++;
    case 5:     *to = *from++;
    case 4:     *to = *from++;
    case 3:     *to = *from++;
    case 2:     *to = *from++;
    case 1:     *to = *from++;
        }while(--n>0);
    }
}

You can't do a goto from an arbitrary location using this technique, but you can wrap your entire function in a switch statement based on a variable, then set that variable indicating where you want to go, and goto that switch statement.

int main () {
  int label = 0;
  dispatch: switch (label) {
  case 0:
    label = some_computation();
    goto dispatch;
  case 1:
    label = another_computation();
    goto dispatch;
  case 2:
    return 0;
  }
}

Of course, if you do this a lot, you'd want to write some macros to wrap it.

This technique, along with some convenience macros, can even be used to implement coroutines in C.
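A minimal sketch of that coroutine idea, without the convenience macros, might look like the following. The function name next_value and the values it yields are invented for this illustration. The saved position is kept in a static variable, and the switch jumps back into the middle of the loop on re-entry. Because of the static state, this version is neither reentrant nor thread-safe.

```c
/* Yields 1, 2, 3 on successive calls, then 0 on every later call. */
int next_value(void)
{
    static int state = 0;   /* which case label to resume at */
    static int i;           /* static: ordinary locals do not survive a return */

    switch (state) {
    case 0:
        for (i = 1; i <= 3; i++) {
            state = 1;      /* remember where to resume */
            return i;
    case 1:;                /* resume here, inside the loop body */
        }
        state = 2;          /* no case matches 2, so later calls skip the switch */
    }
    return 0;               /* finished */
}
```

The case label inside the for loop is legal for the same reason Duff's Device is: case labels may appear anywhere inside the body of the enclosing switch.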

Brian Campbell
  • 3
    There is no guarantee that the `switch/case` will be implemented as a computed `goto`. Quite often it is compiled as if it was a series of `if/else if/else if/...` and the generated assembly will test for each value rather than compute a single address to jump to. – sam hocevar Dec 08 '11 at 09:46
  • 6
    @SamHocevar Sure, you can't depend on how it will be implemented (though cases like this, in which you are using a small range with no holes, are much more likely to be optimized this way). But regardless of whether the optimization is applied, it is semantically equivalent to a `goto` that is conditional on the value that you pass in, due to the fall-through behavior. The behavior is the same; the implementation only affects the performance. And it seems to be a relevant answer to the OP's question, since he's looking to build a state machine using `goto`s, for which `switch` would do the trick. – Brian Campbell Dec 08 '11 at 23:00
  • 1
    Your implementation of *Duff's device* is broken: the `case 0:` should be moved to the end of the `do` body and followed by an empty statement. As coded, sending 0 bytes incorrectly sends 8 bytes. – chqrlie Aug 09 '19 at 06:22
  • @chqrlie I guess OP copied the example from wikipedia where it's stated that "This code assumes that initial count > 0." On another note I don't think this kind of loop unrolling makes sense now as the compiler will unroll the loop if it makes sense and even if it doesn't ALU pipelining will forward calculate the exit conditions of the loop for many iterations so that this kind of manual trickery is irrelevant on modern processors. – glades Jun 27 '22 at 06:59
8

I will note that the functionality described here (including && in gcc) is IDEAL for implementing a Forth language interpreter in C. That blows all the "don't do that" arguments out of the water - the fit between that functionality and the way Forth's inner interpreter works is too good to ignore.

Kip Ingram
4

Use function pointers and a while loop. Don't make a piece of code someone else will have to regret fixing for you.

I presume you're trying to change the address of the label somehow externally. Function pointers will work.
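One way this could look, as a sketch only: each state is a function that returns the next state, and a while loop drives the machine through a table of function pointers. The state names, the handlers and run_machine are hypothetical and are not taken from the question's code.

```c
/* Hypothetical states for illustration. */
enum state { ST_START, ST_COUNT, ST_DONE };

/* Each handler does its work and returns the state to run next. */
typedef enum state (*state_fn)(int *counter);

static enum state on_start(int *counter)
{
    *counter = 0;
    return ST_COUNT;
}

static enum state on_count(int *counter)
{
    ++*counter;
    return *counter < 3 ? ST_COUNT : ST_DONE;
}

/* Indexed by enum state; ST_DONE has no handler because the loop stops there. */
static const state_fn handlers[] = { on_start, on_count };

/* Drives the machine until ST_DONE and returns the final counter. */
int run_machine(void)
{
    int counter = -1;
    enum state s = ST_START;

    while (s != ST_DONE)
        s = handlers[s](&counter);

    return counter;
}
```

Because the current state is an ordinary variable, it can be stored, passed around, or logged, which covers the "remember where I came from" requirement in standard C.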

4
#include <stdio.h>

int main(void) {
  void *fns[3] = {&&one, &&two, &&three};
  char p;

  p = -1;

  goto start;

end:
  return 0;

start:
  p++;
  goto *fns[p];

one:
  printf("hello ");
  goto start;

two:
  printf("World. \n");
  goto start;

three:
  goto end;
}
Unheilig
RUE_MOHR
  • 3
    Note that this is not standard C++, rather an extension provided by the GNU C++ compiler (see https://gcc.gnu.org/onlinedocs/gcc-6.2.0/gcc/Labels-as-Values.html#Labels-as-Values). Clang also has this extension, while Visual C++ does not (see http://stackoverflow.com/questions/6421433/address-of-labels-msvc). – Pietro Braione Oct 03 '16 at 08:09
  • @PietroBraione It should be in the C standard, it makes sense in some edge cases when you don't want to dive down to assembly just for doing that and for portability reasons. – glades Jun 27 '22 at 07:29
3

The only officially supported thing that you can do with a label in C is goto it. As you've noticed, you can't take the address of it or store it in a variable or anything else. So instead of saying "don't do that", I'm going to say "you can't do that".

Looks like you will have to find a different solution. Perhaps assembly language, if this is performance-critical?

Greg Hewgill
1

Read this: setjmp.h - Wikipedia. As previously said, it is possible with setjmp/longjmp, which let you store a jump point in a variable and jump back to it later.

icefex
1

You can assign a label's address to a variable using && (a GNU C extension). Here is your modified code.


int main (void)
{
  int i=1;
  void* the_label_pointer = &&the_label;

  the_label:


  if( i-- )
    goto *the_label_pointer;


  return 0;
}
Olter
Mayank
0

According to this thread, label pointers are not standard, so whether they work depends on the compiler you're using.

Kaleb Brasee
0

You can do something like Fortran's computed goto with pointers to functions.

// global variables up here

void c1(){ // chunk of code

}

void c2(){ // chunk of code

}

void c3(){
// chunk of code

}

void (*goTo[3])(void) = {c1, c2, c3};

// then
int main(void) {
    int x = 0;

    goTo[x++] ();

    goTo[x++] ();

    goTo[x++] ();

    return 0;
}

Or use try/catch

#include <iostream>
template<int N> struct GoTo{};
template<int N> using Label = GoTo<N>;

int main() {
    int x;
    std::cin >> x;
    try {
        if(x==1) throw GoTo<1>{};
        if(x==2) throw GoTo<2>{};
        if(x==3) throw GoTo<3>{};
        throw x;
    }
    catch(Label<1>){
        std::cout << 1;
    }
    catch(Label<2>){
        std::cout << 2;
    }
    catch(Label<3>){
        std::cout << 3;
    }
    catch(int x){
        std::cout << x;
    }
}
QuentinUK
  • The benefit of an address label is also having access to the stack, not just the (faster) function call. But indeed might be one of the few solutions for MSVC – HelloWorld Oct 04 '19 at 02:45
  • @HelloWorld Why is a function call faster? – glades Jul 22 '22 at 18:39