
I have this piece of code:

int main(){
        int x = 13;
        goto f;
        asm __volatile__ (".byte 0xff");
        f:
        return 0;
}

I don't understand why g++ optimizes it away and does not include the opcode (these lines are missing from the disassembly):

# 5 "q.c" 1
        .byte 0xff
# 0 "" 2

even if I compile without any optimization at all: g++ -g -O0 -S q.c. I also tried g++ -g and g++ -O0 separately, because I read that the two flags might not be compatible in some situations.

If I comment out the goto f; line, the opcode is inserted:

        .file   "q.c"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $13, -4(%rbp)
#APP
# 5 "q.c" 1<<<<<<<<<<
        .byte 0xff<<<<<<<<<<
# 0 "" 2<<<<<<<<<<
.L2:
#NO_APP
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:
        .size   main, .-main
        .ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
        .section        .note.GNU-stack,"",@progbits

The question is:

Does g++ leave out a piece of code that will never be executed, even if I compile with no optimization at all?

I want to know why it does not include that piece of code; I am not looking for alternative solutions.

Update

I read in the comments that it is bad code. But what if I want to have it? What if I want a piece of code injected right there that does not do anything by itself? Does g++ restrict me?

Update 2

"Because it's dead code" is not an explanation. I compiled this code on Windows with VS2012:

#include <iostream>

int main() {
    std::cout << "something ";
    goto foo;
     __asm _emit 0xff
     __asm _emit 0xfe;
    foo :
    std::cout << "other thing";
}

Guess what? When it is compiled with the Debug configuration, the asm code is included in the binary:

.text:00414ECE                 push    offset aSomething ; "something "
.text:00414ED3                 mov     eax, ds:__imp_?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::basic_ostream<char,std::char_traits<char>> std::cout
.text:00414ED8                 push    eax
.text:00414ED9                 call    loc_41129E
.text:00414EDE                 add     esp, 8
.text:00414EE1                 jmp     short loc_414EE7
.text:00414EE1 ; ---------------------------------------------------------------------------
.text:00414EE3                 db 0EBh ; d
.text:00414EE4                 db    2
.text:00414EE5                 db 0FFh<<<<<<<<<<<<<<
.text:00414EE6                 db 0FEh ; ¦<<<<<<<<<<<<
.text:00414EE7 ; ---------------------------------------------------------------------------
.text:00414EE7
.text:00414EE7 loc_414EE7:                             ; CODE XREF: main+31j
.text:00414EE7                 push    offset aOtherThing ; "other thing"
.text:00414EEC                 mov     eax, ds:__imp_?cout@std@@3V?$basic_ostream@DU?$char_traits@D@std@@@1@A ; std::basic_ostream<char,std::char_traits<char>> std::cout
.text:00414EF1                 push    eax
.text:00414EF2                 call    loc_41129E
.text:00414EF7                 add     esp, 8
.text:00414EFA                 jmp     short loc_414EFE
Emil Condrea

3 Answers


Interestingly, all the compilers I could try (gcc, llvm-gcc, icc, clang) optimize out the code.

As a workaround, you can include the goto in the asm itself:

__asm__ __volatile__ (
    "jmp 1f\n\t"
    ".byte 0xff\n\t"
    "1:");

This is unfortunately architecture specific, while your original code might not be. For that case, the best I could think of is:

volatile int never = 0;  /* not named "false": that is a keyword in C++ */
if (never) __asm__ __volatile__ (".byte 0xff");

Of course this incurs a runtime load and test. Both of these work even with optimizations enabled.

Jester

What you have there is unreachable code. Removing it is not an optimisation, it's just saving space.

The compiler is obliged to deliver code that executes the function described by the source code. The asm definition is implementation defined -- the compiler can do whatever it wants, and you can't complain. It does what it does.

The obvious suggestion is to put a label on the code to give the compiler at least one way to get there, even if it never does.
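A minimal sketch of that label idea, assuming GNU-style inline asm (gcc or clang). The names marker_demo and selector are hypothetical and not part of the question. The branch depends on a runtime value, so the compiler cannot prove the label is never reached when it compiles this function on its own. An optimizer that sees every call site could still discard the block in inlined copies.

```cpp
#include <cassert>

// Hypothetical helper: `selector` is a runtime value, so the branch to
// `marker` is not provably dead when this function is compiled alone.
int marker_demo(int selector) {
    if (selector == 0x7fffffff)  // expected never to be true in practice
        goto marker;
    goto done;
marker:
    __asm__ __volatile__(".byte 0xff");  // the byte to keep in the output
done:
    return 0;
}

// Usage: ordinary inputs skip the marker and return normally.
inline void marker_demo_usage() {
    assert(marker_demo(1) == 0);
}
```

Compiling this with g++ -S and searching the output for .byte 0xff is one way to check whether a particular compiler version kept it.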

Edit: Omitting unreachable code is not on any of the lists of compiler optimisations I checked, but at some point this becomes a matter of definitions. A compiler writer could easily arrange to omit code after an unconditional branch without any sophisticated block analysis. I guess you'd have to ask them.

david.pfx
  • Labels are function-local, compiler will easily see through it. – keltar Mar 20 '14 at 13:35
  • @david.pfx I updated my question. The unreachable code **it is** included in the binary when compiling with VS2012 – Emil Condrea Mar 20 '14 at 13:43
  • “Saving space” is an optimisation. – rightfold Mar 20 '14 at 18:07
  • @rightfold: No, saving space is just one of the possible goals for optimisation. Most lists of compiler optimisations I checked do not include omission of unreachable code (dead code is different). But I might reword my answer. – david.pfx Mar 20 '14 at 22:07
  • @EmilCondrea: just another example of 'implementation-defined behaviour'. You gets what you gets. – david.pfx Mar 20 '14 at 22:15
  • @keltar: You can __always__ fool an optimiser. `if (tm.tm_mon > 12) goto ...` – david.pfx Mar 20 '14 at 22:22

Compilers typically translate code into an internal graph of so-called basic blocks. A basic block is a sequence of instructions with no jumps except for the last instruction of the basic block.

The code you gave would thus yield basic blocks like this:

[Figure: basic blocks]

The compiler then starts traversing the basic block graph at the first basic block, but never visits the second basic block because there is no other basic block that jumps to it (i.e. the second basic block has no incoming edges in the graph), hence it’s not converted to assembly. In other words, omitting the assembly generation for the second basic block is not deliberate but merely a side-effect of the graph traversal.
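The traversal described above can be sketched as a reachability walk over a small graph. This is only an illustration under assumptions, not how gcc is actually implemented: the Cfg type, reachable_blocks and question_cfg are hypothetical names, and the three-block split of the question's main() is a guess at what the missing figure showed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical control-flow graph: successors[i] lists the blocks that
// block i can jump to or fall through into.
using Cfg = std::vector<std::vector<std::size_t>>;

// Mark every block reachable from `entry` using an iterative depth-first walk.
std::vector<bool> reachable_blocks(const Cfg& successors, std::size_t entry) {
    assert(entry < successors.size());
    std::vector<bool> seen(successors.size(), false);
    std::vector<std::size_t> worklist{entry};
    seen[entry] = true;
    while (!worklist.empty()) {
        std::size_t block = worklist.back();
        worklist.pop_back();
        for (std::size_t next : successors[block]) {
            if (!seen[next]) {
                seen[next] = true;
                worklist.push_back(next);
            }
        }
    }
    return seen;
}

// The question's main() modelled as three blocks (an assumption):
//   0: x = 13; goto f        -> 2
//   1: asm(".byte 0xff")     -> 2 (falls through to f)
//   2: f: return 0           -> no successors
Cfg question_cfg() {
    return Cfg{{2}, {2}, {}};
}
```

Calling reachable_blocks(question_cfg(), 0) marks blocks 0 and 2 but not block 1. Block 1 has an outgoing edge but no incoming one, so a code generator that only emits visited blocks never sees the asm statement.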

Microsoft’s compiler may work differently than other compilers in this regard. For example, it may translate all basic blocks to assembly, concatenate the results and then resolve jumps rather than the other way around.