
C++ disallows "goto-ing over a definition":

goto jumpover;
int something = 3;
jumpover:
std::cout << something << std::endl;

This raises an error, as expected, because "something" won't be declared (or defined).

However, I jumped over using assembly code:

#include<iostream>
using namespace std;
int main(){
    asm("\njmp tag\n");
    int ptr=9000;//jumped over
    cout << "Ran" << endl;
    asm("\ntag:\n");
    cout << ptr << endl;
    return 0;
}

It printed 9000, although the int ptr=9000;//jumped over line is NOT executed (the program did not print Ran). I expected memory corruption or an undefined value when ptr is used, because the memory isn't allocated (although the compiler thinks it is, because it does not understand asm). How can it know ptr is 9000?

Does that mean ptr is created and assigned at the start of main() (and therefore not skipped, due to some optimization or whatever), or is there some other reason?

Peter Cordes
  • You need to check the generated machine code to see what happens. For instance, in this live demo with enabled optimizations, the compiler effectively transformed the source code into `cout << 9000 << endl;`: https://godbolt.org/z/6sn3K9j7d. With disabled optimizations, this did not happen and 9000 was not printed. – Daniel Langr Feb 23 '22 at 11:24
  • I get different results depending on optimization. [`-O0` example](https://godbolt.org/z/d6sPs9Y8d) (the variable is _not_ initialized with 9000) and [`-O3` example](https://godbolt.org/z/KE1bheshW) (the whole program is optimized away and the ending `ptr == 9000` is `true`) – Ted Lyngmo Feb 23 '22 at 11:26
  • Space for local variables is reserved on function entry, not at the point of declaration, and a decent compiler will eliminate the `ptr` variable entirely. There is no one-to-one correspondence between source and execution, either in time or space. (Also, you can't expect undefined behaviour to be anything in particular.) – molbdnilo Feb 23 '22 at 11:57
  • @molbdnilo: I added the [undefined-behavior] tag; I'm not sure the querent realized this was UB. Hmm, I only just noticed they said they knew the compiler didn't understand asm, and were talking about space allocated or not; updated my answer. – Peter Cordes Feb 23 '22 at 12:01
  • Re: "won't be declared (or defined)" for the first code snippet -- the issue is actually that it won't be **initialized**. `int something;` would be okay, since it doesn't initialize `something`. The name would be in scope from the point of definition to the end of the block. – Pete Becker Feb 23 '22 at 14:40

1 Answer


Jumping between asm() statements is not supported by GCC;
your code has undefined behaviour.
Literally anything is allowed to happen.

There's no __builtin_unreachable() after the first asm statement, and you didn't even use asm goto("" : : : : label) (GCC manual) to tell the compiler about a C label the asm statement might or might not jump to.

Whatever happens in practice with different versions of gcc/clang and different optimization levels when you do that is a coincidence / implementation detail / result of whatever the optimizer actually did.

For example, with optimization enabled it would do constant-propagation assuming that the int ptr=9000; statement would be reached, because it's allowed to assume that execution comes out the end of the first asm statement.

You'd have to look at the compiler's full asm output to see what actually happened. e.g. https://godbolt.org/z/MbGhEnK3b shows GCC -O0 and -O2. With -O0 you do indeed get it reading uninitialized stack space since it jumps over a mov DWORD PTR [rbp-4], 9000, and with -O2 you get constant-propagation: mov esi, 9000 before the call std::basic_ostream<char,... operator <<(int) overload.
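For contrast, here is a minimal sketch of the supported way to tell GCC that an asm statement might branch: asm goto, with the C label listed in the last section. The function name `value_after_possible_jump` and the label `skip` are made up for illustration. The template is left empty so the snippet stays architecture-neutral; on x86 an actual jump would be written as `jmp %l[skip]`. The variable is declared before the asm statement so that no initialization is jumped over.

```cpp
// Hypothetical example; requires GCC, or a clang version with asm goto support.
int value_after_possible_jump() {
    int value = 0;  // initialized before the asm statement, so nothing is skipped

    // Empty template: control always falls through here, but listing `skip`
    // tells the compiler that this statement might branch to that label.
    asm goto("" : : : : skip);

    value = 9000;   // reached only when the asm statement falls through
skip:
    return value;   // the compiler has to allow for both paths reaching this point
}
```

Because the label is declared, the optimizer can no longer assume that `value = 9000;` always runs before `skip:`, so it has no basis for folding the result into a constant the way it did with the plain asm statements.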

> because the memory isn't allocated

Space for it actually is allocated, in the function prologue: compilers don't generate code to move the stack pointer every time they encounter a declaration inside a scope. They reserve space once at the start of the function. Even the one-pass Tiny C Compiler works this way, rather than using a separate push to allocate and initialize each int variable. (This is actually a missed optimization in some cases where push would be useful to alloc + init in one instruction; see "What C/C++ compiler can use push pop instructions for creating local variables, instead of just increasing esp once?")
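One way to see that a declaration by itself does not have to run any code: standard C++ accepts a goto past the declaration of a scalar that has no initializer, since there is no initialization to bypass. A small sketch (the function and label names are hypothetical):

```cpp
// Hypothetical example: well-formed C++, because `value` has no initializer to skip.
int assign_after_jump() {
    goto assign;
    int value;      // jumped over; the name is in scope from here to the end of the block
assign:
    value = 9000;   // the variable is usable and gets its value here
    return value;
}
```

Adding `= 9000` to the declaration turns this back into the compile error from the question.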


Even more so than most other kinds of C undefined behaviour, this is not something the compiler can actually detect at run-time to warn you about. asm statements just insert text into GCC's asm output, which is fed to the assembler. You need to accurately describe to the compiler what the asm does (using constraints and things like asm goto) to give it enough information to generate correct code around your asm statement.

GCC does not parse the instructions in the asm template; it just copies the template directly to the asm output. (Or, for Extended asm, it first substitutes the %0, %1 etc. operands with text generated according to the operand constraints.)
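A minimal sketch of what "describing the asm to the compiler" looks like with Extended asm. The template is deliberately empty so it assembles on any target; the `"+r"` constraint is the only thing the compiler sees, and it says the statement reads and may modify `x` in a register. The function name `pass_through_register` is made up for illustration.

```cpp
// Hypothetical example using GCC/clang Extended asm.
int pass_through_register(int x) {
    // "+r"(x): x is both an input and an output, held in a general-purpose register.
    // The compiler does not look inside the template; it trusts this description
    // and must treat x as possibly changed after the statement.
    asm volatile("" : "+r"(x));
    return x;
}
```

If the template really did modify a register or jump somewhere without a matching constraint, clobber, or goto label, the compiler would have no way to notice, which is the situation in the question.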

Peter Cordes
  • I wish more people would recognize the distinction between constructs which the Standard says that implementations aren't *required* to define (but may--and often should--define anyway) and those which an implementation says that it doesn't define. If a program needs to do things not anticipated by the Standard, the right way to accomplish that is often to perform actions which aren't defined by the Standard but will be defined by implementations suitable for such tasks. If an implementation says it will make no effort to process a construct meaningfully, however, it should be taken at its word. – supercat Feb 23 '22 at 20:54