
I was surprised by this, today:

#include <stdlib.h> // for the 'exit' call

int foo() {
    // return 0;
}

int main() {
    int res = foo();
    exit(res);
}

I know it's not good form to forget to return the expected integer value in foo; but would you expect this code to segfault?

Here's what happens with GCC 7.5:

(thanassis)$ g++ -O3 -Wall  a.cc
a.cc: In function ‘int foo()’:
a.cc:5:5: warning: no return statement in function returning non-void [-Wreturn-type]
     }
     ^

(thanassis)$ ./a.out 

(thanassis)$ gdb  ./a.out
GNU gdb (Ubuntu 10.2-0ubuntu1~18.04~2) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)

(gdb) run
Starting program: /home/thanassis/a.out 
[Inferior 1 (process 24749) exited normally]

(gdb) quit

No problems. All good.

Yes, the fact that foo neglected to set the return value means that the register the ABI picks for the task (EAX) will contain garbage. Whatever.

Now look at what happens with GCC 11:

(thanassis)$ g++ -O3 -Wall ./a.cc 
./a.cc: In function ‘int foo()’:
./a.cc:5:5: warning: no return statement in function returning non-void [-Wreturn-type]
    5 |     }
      |     ^

(thanassis)$ ./a.out 
Segmentation fault (core dumped)

(thanassis)$ gdb  ./a.out
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./a.out...
(No debugging symbols found in ./a.out)
(gdb) run
Starting program: /home/thanassis/a.out 

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7ddbfaa in __libc_start_main (main=0x555555555044 <main>, argc=-136462205, argv=0x7fffff7ff0b0, init=0x555555555140 <__libc_csu_init>, 
    fini=0x5555555551b0 <__libc_csu_fini>, rtld_fini=0x7fffffffe0e8, stack_end=0x7fffff7ff0a8) at ../csu/libc-start.c:141
141     ../csu/libc-start.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7ddbfaa in __libc_start_main (main=0x555555555044 <main>, argc=-136462205, argv=0x7fffff7ff0b0, init=0x555555555140 <__libc_csu_init>, 
    fini=0x5555555551b0 <__libc_csu_fini>, rtld_fini=0x7fffffffe0e8, stack_end=0x7fffff7ff0a8) at ../csu/libc-start.c:141
#1  0x000055555555507e in _start ()
(gdb) 

Now this, I did not expect.

For one, the stack frames seem to be messed up - where's the main stack frame?

Looking at the output of objdump for main...

$ objdump -d -S ./a.out 
...
Disassembly of section .text:

0000000000001040 <_Z3foov>:
#include <stdlib.h>

    int foo() {
    1040:       f3 0f 1e fa             endbr64 

0000000000001044 <main>:
        // return 0;
    }

    int main() {
    1044:       f3 0f 1e fa             endbr64 
    1048:       0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
    104f:       00 

0000000000001050 <_start>:
    1050:       f3 0f 1e fa             endbr64 
    1054:       31 ed                   xor    %ebp,%ebp
    1056:       49 89 d1                mov    %rdx,%r9
    1059:       5e                      pop    %rsi
    105a:       48 89 e2                mov    %rsp,%rdx
    105d:       48 83 e4 f0             and    $0xfffffffffffffff0,%rsp

...it looks like GCC decided to "merge" stack frames?!

This looks like a compiler bug to me. Note that, older compilers aside, it also doesn't manifest at optimisation levels -O0 and -O1, but it does from -O2 upwards.

Again, to be clear: I know it's bad form, and I do use -Wall and -Wextra - so I did fix this in my code. But I thought I'd share this here, since I just never expected an int-returning function not returning an int to create a segfault (due to the compiler creating code that misses a stack frame).

UPDATE: Note also that compiling with gcc and not g++ creates normal code. The issue only manifests when compiling the code with a C++ compiler.

ttsiodras
  • 16
    Your code has undefined behavior, all results are "correct". If you say you are going to return a value, the compiler expects that you will. Breaking that contract breaks the behavior. – NathanOliver Mar 01 '23 at 13:38
  • 3
    By definition, the compiler is allowed to do absolutely anything when UB is involved, so it can't be a bug. – Yksisarvinen Mar 01 '23 at 13:39
  • 3
    _"...Flowing off the end of a value-returning function (except main) without a return statement is __undefined behavior__...."_ https://en.cppreference.com/w/cpp/language/return – Richard Critten Mar 01 '23 at 13:40
  • Not a bug. Add a `-Werror=return-type` to your C++ compilations. (if you hadn't used the return value, this would be somewhat legal C) – teapot418 Mar 01 '23 at 13:41
  • _"Is this a GCC11 bug?"_ the answer is always no, unless you know how to prove it is or at least come close to doing so. – Passer By Mar 01 '23 at 13:49
  • Maybe to be merged with https://stackoverflow.com/questions/367633/what-are-all-the-common-undefined-behaviours-that-a-c-programmer-should-know-a – pptaszni Mar 01 '23 at 13:53
  • Btw, in C++ a function declared to return an `int` can avoid ever returning an `int` without undefined behavior, namely when it throws an exception. It is not possible to decide, for all possible functions one can write, whether a function will eventually return something or throw an exception. – 463035818_is_not_an_ai Mar 01 '23 at 14:04
  • There are countless duplicates of this question. – Lundin Mar 01 '23 at 14:11
  • 2
    "I tied 600 pounds to a rope that the manufacturer rates to only hold 500 pounds. The rope didn't break. I bought a second rope from the same manufacturer, same rating, tied 600 pounds to it, and the rope broke immediately. Is there anything wrong with the second rope?" -- You invoked "undefined behavior" on those ropes --- anything could have happened. – PaulMcKenzie Mar 01 '23 at 14:49

1 Answer


It's more than "bad form", it's undefined behavior.

Your function is failing to return a value when declared to do so, and you're attempting to use that return value. This triggers undefined behavior, which in the -O3 case causes a crash.

This is spelled out in section 6.9.1p12 of the C standard:

If the } that terminates a function is reached, and the value of the function call is used by the caller, the behavior is undefined.

So to answer your question: no, it's not a compiler bug. You're just doing something you shouldn't.

dbush
  • 5
    Note: C++ standard made this requirement stronger and it's UB no matter if caller uses the result or not. – Yksisarvinen Mar 01 '23 at 13:46
  • The formulation talking about the "`}` that terminates a function" is somewhat funny. I never thought about a `return` exiting the function via the `}`. I suppose it's a C++ thingy where you rather consider the implicitly called destructors and imagine how control goes up rather than down – 463035818_is_not_an_ai Mar 01 '23 at 13:55
  • Understood. I added a note that when compiling the code with `gcc` instead of `g++`, the issue disappears - so basically... "random" results. I'll make a note to use UBSan henceforth. – ttsiodras Mar 01 '23 at 14:02
  • @ttsiodras it's not actually "random". With the same compiler, same version, and same flags, the outcome of UB is reproducible most of the time. There is just no guarantee of that reproducibility. – 463035818_is_not_an_ai Mar 01 '23 at 14:07