
Here's a trivial program, compiled on an Intel Mac as C++11:

#include <iostream>

void show() //noexcept
{
   std::cout << "Hello, World!\n";
}

int main(int argc, const char * argv[]) 
{
   show();
   return 0;
}

I've been examining how the generated code differs with and without 'noexcept', in both debug mode and release (optimized) mode.

In debug mode, without the noexcept qualifier, I get the following instructions for the show function:

0x100003280 <+0>:  pushq  %rbp
0x100003281 <+1>:  movq   %rsp, %rbp
0x100003284 <+4>:  movq   0xd75(%rip), %rdi         ; (void *)0x00007fff96d36760: std::__1::cout
0x10000328b <+11>: leaq   0xcc2(%rip), %rsi         ; "Hello, World!\n"
0x100003292 <+18>: callq  0x100003e2c               ; symbol stub for: std::__1::basic_ostream<char.....
0x100003297 <+23>: popq   %rbp
0x100003298 <+24>: retq   

This is straightforward: create the stack frame, set up the call to the stream insertion operator on cout, tear down the stack frame and return.

However, if I uncomment 'noexcept' then I get the following:

0x100003240 <+0>:  pushq  %rbp
0x100003241 <+1>:  movq   %rsp, %rbp
0x100003244 <+4>:  subq   $0x10, %rsp
0x100003248 <+8>:  movq   0xdb1(%rip), %rdi         ; (void *)0x00007fff96d36760: std::__1::cout
0x10000324f <+15>: leaq   0xcee(%rip), %rsi         ; "Hello, World!\n"
0x100003256 <+22>: callq  0x100003e0c               ; symbol stub for: std::__1::basic_ostream<char....
0x10000325b <+27>: jmp    0x100003260               ; <+32> at main.cpp:14:1
0x100003260 <+32>: addq   $0x10, %rsp
0x100003264 <+36>: popq   %rbp

Note the extra subtraction from %rsp (third line) and the matching addition (second-to-last line). More curious is the jmp instruction, which just jumps to the next instruction.

Questions

  1. Why is extra code being generated with noexcept? I thought there would be less, since I'm guaranteeing that the stack doesn't need to be unwound.
  2. What's the purpose of that jmp instruction?

Thanks in advance

Peter Cordes
David
  • Does any of this still happen with optimization enabled? Debug builds can have useless instructions left in. Also, looking at the compiler-generated asm source is often more informative than disassembly, you can look for metadata directives and labels. https://godbolt.org/z/qTGfYY6vj – Peter Cordes Jul 07 '22 at 02:48
  • Yeah, clang targeting Linux reproduces this with the default `-O0`, but it goes away at `-O1` even without optimizing the tailcall into a jmp. https://godbolt.org/z/v74b8zfPj. So just another case of internal complexity not getting optimized away at -O0. A bit like [Why does clang produce inefficient asm with -O0 (for this simple floating point sum)?](https://stackoverflow.com/q/53366394) but that's mostly about not keeping things in registers across C statements. – Peter Cordes Jul 07 '22 at 02:53
  • Performance and generated instructions should NEVER be evaluated in a debug or non-optimized build (like -O0). The behavior is so drastically different that it's a different beast altogether. – ShaulF Jul 13 '22 at 08:00

0 Answers