2

Suppose you wrote a function in C++, but absentmindedly forgot to type the word return. What would happen in that case? I was hoping that the compiler would complain, or at least that a segmentation fault would be raised once the program got to that point. However, what actually happens is far worse: the program spews out rubbish. Not only that, but the actual output depends on the level of optimization! Here's some code that demonstrates this problem:

#include <iostream>
#include <vector>

using namespace std;

double max_1(double n1,
         double n2)
{
  if(n1>n2)
    n1;
  else
    n2;
}

int max_2(const int n1,
      const int n2)
{
  if(n1>n2)
    n1;
  else
    n2;
}

size_t max_length(const vector<int>& v1,
          const vector<int>& v2)
{
  if(v1.size()>v2.size())
    v1.size();
  else
    v2.size();
}

int main(void)
{
  cout << max_1(3,4) << endl;
  cout << max_1(4,3) << endl;

  cout << max_2(3,4) << endl;
  cout << max_2(4,3) << endl;

  cout << max_length(vector<int>(3,1),vector<int>(4,1)) << endl;
  cout << max_length(vector<int>(4,1),vector<int>(3,1)) << endl;

  return 0;
}

And here's what I get when I compile it at different optimization levels:

$ rm ./a.out; g++ -O0 ./test.cpp && ./a.out
nan
nan
134525024
134525024
4
4
$ rm ./a.out; g++ -O1 ./test.cpp && ./a.out
0
0
0
0
0
0
$ rm ./a.out; g++ -O2 ./test.cpp && ./a.out
0
0
0
0
0
0
$ rm ./a.out; g++ -O3 ./test.cpp && ./a.out
0
0
0
0
0
0

Now imagine that you're trying to debug the function max_length. In production mode you get the wrong answer, so you recompile in debug mode, and now when you run it everything works fine.

I know there are ways to avoid such cases altogether by adding the appropriate warning flags (-Wreturn-type), but I still have two questions:

  1. Why does the compiler even agree to compile a function without a return statement? Is this feature required for legacy code?

  2. Why does the output depend on the optimization level?

Shafik Yaghmour
user2535797
  • 3
    1. It is not easy to prove that a function does not have a return for all paths. 2. It is undefined behaviour, so you can't expect any specific output. – juanchopanza Oct 07 '14 at 13:42
  • 1
    Related to [Why does this C++ snippet compile (non-void function does not return a value)](http://stackoverflow.com/q/20614282/1708801). Basically since it is undefined behavior your results are unpredictable, compilers are notorious for taking advantage of UB during optimization. It is hard to detect this in all cases. – Shafik Yaghmour Oct 07 '14 at 13:42
  • do you pay attention to the warnings? Pass -Werror flag to gcc and it will not compile. – Slava Oct 07 '14 at 13:48
  • 1
    This article by John Regehr: [Finding Undefined Behavior Bugs by Finding Dead Code](http://blog.regehr.org/archives/970) gives some interesting examples of compilers taking advantage of UB. – Shafik Yaghmour Oct 07 '14 at 13:49
  • @Slava the OP knows how to make the compiler warn about this; the OP is asking why the compiler even allows it, which is a fair question if you are unaware of undefined behavior. – Shafik Yaghmour Oct 07 '14 at 13:50
  • @juanchopanza why is it hard to prove a function does not return in some cases? Couldn't it just replace each if with a possible path and check all boolean combinations possible? – rubenvb Oct 07 '14 at 13:53
  • 1
    @juanchopanza I understand it might be hard at compile time, but surely during runtime the program can notice it has reached the end of the function before a return statement was encountered. – user2535797 Oct 07 '14 at 13:55
  • 1
    It isn't that obvious. It would have to be instrumented, and that would carry a cost. – juanchopanza Oct 07 '14 at 14:07
  • Note, the current accepted answer is incorrect as the comments by Neil point out, although the issue he points out is not the only one. – Shafik Yaghmour Oct 09 '14 at 02:24

2 Answers

9

It is undefined behavior to flow off the end of a value-returning function. This is covered in the draft C++ standard, section 6.6.3 The return statement, which says:

Flowing off the end of a function is equivalent to a return with no value; this results in undefined behavior in a value-returning function.

The compiler is not required to issue a diagnostic; we can see this from section 1.4 Implementation compliance, which says:

The set of diagnosable rules consists of all syntactic and semantic rules in this International Standard except for those rules containing an explicit notation that “no diagnostic is required” or which are described as resulting in “undefined behavior.”

Compilers do, in general, try to catch a wide range of undefined behavior and produce warnings, although you usually need to use the right set of flags. For gcc and clang I find the following set of flags to be useful:

-Wall -Wextra -Wconversion -pedantic

and in general I would encourage you to turn warnings into errors using -Werror.
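As a point of comparison, here is a sketch of the question's three functions with the missing return keywords restored. The names and signatures are taken from the question; only the return statements (and the explicit std:: qualification) are new. Because every path out of each function now returns a value, -Wreturn-type should have nothing to report for this version.

```cpp
#include <cstddef>
#include <vector>

// Same signatures as in the question; each branch now returns its value.
double max_1(double n1, double n2)
{
  if (n1 > n2)
    return n1;
  else
    return n2;
}

int max_2(const int n1, const int n2)
{
  if (n1 > n2)
    return n1;
  else
    return n2;
}

std::size_t max_length(const std::vector<int>& v1,
                       const std::vector<int>& v2)
{
  if (v1.size() > v2.size())
    return v1.size();
  else
    return v2.size();
}
```

With these definitions, the six calls in the question's main all print 4, regardless of the optimization level.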

Compilers are notorious for taking advantage of undefined behavior during the optimization stages. See Finding Undefined Behavior Bugs by Finding Dead Code for some good examples, including the infamous Linux kernel null pointer check removal where, in processing this code:

struct foo *s = ...;
int x = s->f;
if (!s) return ERROR;

gcc inferred that since s was dereferenced in s->f, and since dereferencing a null pointer is undefined behavior, s must not be null, and it therefore optimized away the if (!s) check on the next line (copied from my answer here).
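A minimal sketch of the safe ordering, with the null check placed before the first dereference so the compiler has no grounds to discard it. The definition of struct foo, the value of ERROR, and the function name read_f are assumptions made purely for illustration; this is not the kernel code.

```cpp
// Hypothetical stand-ins for the names used in the snippet above.
struct foo { int f; };
const int ERROR = -1;

// Test the pointer first, then dereference it.
int read_f(const foo* s)
{
  if (!s)
    return ERROR;  // s has not been dereferenced yet, so the check must stay
  return s->f;
}
```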

Since undefined behavior is unpredictable, the compiler will in many cases perform more aggressive optimizations at more aggressive settings. Many of them may not make much intuitive sense, but it is undefined behavior, so you should have no expectations anyway.

Note that although there are many cases where the compiler can determine that a function does not return properly, in the general case this is the halting problem. Checking for it automatically at run time would carry a cost, which violates the "don't pay for what you don't use" philosophy. Both gcc and clang do, however, implement sanitizers that check for things like undefined behavior; for example, the -fsanitize=undefined flag checks for undefined behavior at run time.
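To illustrate why static detection is hard, here is a hypothetical function (the name sign is mine, not from the question) in which every possible argument hits one of the return statements, yet seeing that requires reasoning about values rather than syntax. A compiler that does not do that reasoning may warn that control can reach the end. One inexpensive way to keep the tail well defined, sketched here as an assumption about what you might want, is to end it with something that never returns, such as a throw:

```cpp
#include <stdexcept>

int sign(int x)
{
  if (x > 0) return 1;
  if (x < 0) return -1;
  if (x == 0) return 0;
  // No int value reaches this point, but proving that takes value reasoning.
  // Throwing keeps the function well defined even if the conditions above
  // are later edited so that some input slips through.
  throw std::logic_error("sign: unreachable");
}
```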

Shafik Yaghmour
-1

You may want to check out this answer here

The gist of it is that the compiler allows you to omit a return statement because there are potentially many different execution paths, and ensuring that each one exits with a return can be tricky at compile time, so the compiler will take care of it for you.

Things to remember:

If main ends without a return, it will always return 0.

If another function ends without a return, it will always return the last value in the eax register, usually the result of the last statement.

Optimization changes the code at the assembly level. This is why you are getting the weird behavior: the compiler is "fixing" your code for you, changing when things are executed, which gives a different last value and thus a different return value.

Hope this helped!

Jared Wadsworth
  • 1
    If another function ends without a return it is undefined behaviour. – Neil Kirk Oct 07 '14 at 17:53
  • Right, it's undefined because you cannot guarantee that the last statement executed is what is in the eax register, but that is the register used when returning, and thus is what is grabbed by the calling function. Unless, of course, it's of a non-integral type, in which case it is XMM or the FPU stack or whatever the type uses. So yes, it's undefined, but usually it is the result of the last statement. – Jared Wadsworth Oct 07 '14 at 18:22
  • 1
    No, it is just undefined. It's nothing to do with eax register. It may be the case that it is usually the last value of the eax register on YOUR machine and YOUR compilers, but I could make a compiler return 47 every time this happens and it would be 100% standard-conforming. – Neil Kirk Oct 07 '14 at 18:59