16

In C++, what happens when a function that is supposed to return an object ends without a return statement? What gets returned?

e.g.

std::string func() {}
songyuanyao
Matt Munson
  • 18
    Undefined behavior. – πάντα ῥεῖ Aug 24 '16 at 08:37
  • 7
    @πάνταῥεῖ, I've never been more disappointed with the current standard until now. – StoryTeller - Unslander Monica Aug 24 '16 at 08:38
  • 3
    Another case of "undefined behavior" which could easily be reported as a compiler error. Sometimes it is a warning: "Not all control paths return a value". – BitTickler Aug 24 '16 at 08:41
  • 6
    @BitTickler Sometimes, you can prove by business logic that a control path will never be reached, but the compiler's static analyser can't. Combine this with a return type which is syntactically difficult to construct, or even impossible to construct in the function in question (private ctors etc.) and you have a hard-to-workaround error which is not really an error in your case. – Angew is no longer proud of SO Aug 24 '16 at 08:49
  • 1
    I experienced this as an error when compiling with MSVC and as a warning with g++ – tobilocker Aug 24 '16 at 09:01
  • 1
    @BitTickler - It is easy to detect the problem in an empty function, like in this question. A real function might call other functions, some of which might throw an exception 10 call levels deep. In that case, some control paths will never reach the end of the function and so don't formally need a return statement. For the compiler it's [the halting problem](https://en.wikipedia.org/wiki/Halting_problem) all over again. – Bo Persson Aug 24 '16 at 09:39
  • 1
    For another example, Java "resolves" this by allowing `null` to be a member of almost every type, so that you can shove redundant return statements in when the compiler demands them. That way the compiler can reject code even though it's not sure the end of the function is reachable. So it's disappointing either way ;-) – Steve Jessop Aug 24 '16 at 10:11
  • 2
    @Angew I strongly disagree with that assessment. Neither the number of paths inside a function nor the presence or absence of a return statement is overly complicated to determine. As for Bo Persson: exceptions have no impact on my statement, as the stack unwinding after a throw is unrelated to the code generated for a function call. Sub-functions do not change the fact that a path inside a function has no return statement (setjmp()/longjmp() and other hacks aside). – BitTickler Aug 24 '16 at 11:11
  • 4
    @BitTickler You misunderstood me. I was saying that sometimes, you can have a path without `return` which looks reachable, but is actually not, because of contexts invisible to the compiler (such as call sites). And the return type can be such that creating an artificial `return` statement can be difficult. – Angew is no longer proud of SO Aug 24 '16 at 11:17
  • 1
    @Angew I would need an example to be sure I do not still misunderstand. To me it looks simple enough. The control flow in a function is a graph. The leaves of that graph are the ends (last instruction) of the function (1..*). For each of those leaves, I am confident it holds that it should be one of: [throw, return, exit, longjmp]. Nothing else comes to my mind. Paths which are not reachable in general are dead code and do not even expose undefined behavior - they do not behave at all ;) – BitTickler Aug 24 '16 at 13:39
  • 1
    @BitTickler All I'm saying is that there can be a situation where the programmer knows (but a static analyser doesn't) that leaf *L* is unreachable, and neither `return` nor `throw` can realistically be added to such a leaf. You are right that this could be solved by artificially putting a `std::exit` (or even `longjmp`) there, commented as "never happens." I don't think I'd want the standard to force me to do that, though. – Angew is no longer proud of SO Aug 24 '16 at 13:49
  • 2
    @BitTickler A good example would be `std::string foo() { while(neverReturnsFalse()) { doSomething(); } }` That function never actually reaches the end of the function, but there's no way for the compiler to know this without solving the halting problem. If the compiler enforced a return statement at the end, you'd have to have code to construct a new object to return, even though it's dead code. Worse, some object types are particularly hard to construct meaningfully without engaging in more undefined behavior. You can imagine the trouble caused by needing to return a... – Cort Ammon Aug 24 '16 at 14:18
  • 1
    ... factory created object, with constructors that you aren't intended to use (or might even be private), and having to construct one anyways just because the language forces you to define a behavior that you, as the developer, know can never occur. – Cort Ammon Aug 24 '16 at 14:21
  • It shouldn't compile :) – machine_1 Aug 24 '16 at 14:57
  • @CortAmmon Quite a constructed example. Poor user of that function declares a variable and the assignment and then spends his evening trying to find out why his program hangs ;) But a valid point nevertheless. Even though the correct way to fix this function for me would be to make it return void. Even if this is part of an interface which is implemented multiple times, this would not make sense. One implementation loops, the next returns a string... hard to fathom how to use such a .... thing :) – BitTickler Aug 24 '16 at 17:13
  • @BitTickler Spec writing is interesting that way. You have to protect people doing silly things because your decisions are so very far reaching and so iron clad. I'm reminded of the case in Java where they had you specify all exceptions that a method could throw, and you were obliged to handle all of them when you called that function. Of course, you could derive from `RuntimeException` to create an "unchecked exception," which they had to do for some built-in errors like floating point exceptions. The result: nearly 100% of exceptions in java are unchecked because people hated... – Cort Ammon Aug 24 '16 at 17:37
  • ... all of the extra syntax they had to add to do checked exceptions. In theory the idea of "oh, the compiler will help you remember to catch exceptions" sounded great, but in practice, nobody wanted it. – Cort Ammon Aug 24 '16 at 17:38
  • @CortAmmon Many Java programmers want and like checked exceptions. But you aren't so much supposed to catch them, you are supposed to add them to *throws*... Languages without checked exceptions invite many programmers to write those horrible "catch everything"-blocks, because compiler doesn't tell them which exceptions they need to worry about, and checking docs for every method called is so tedious. – hyde Aug 24 '16 at 20:43
  • 1
    @hyde From what I have gathered from the long debate on checked vs. unchecked, most people who advocate for checked exceptions admit that they are in the minority. I probably should have been more careful when I said "nobody wanted it," because that was hyperbole on my part. – Cort Ammon Aug 24 '16 at 21:28

3 Answers

28

What gets returned?

We don't know. According to the standard, the behavior is undefined.

§6.6.3/2 The return statement [stmt.return]:

(emphasis mine)

Flowing off the end of a constructor, a destructor, or a function with a cv void return type is equivalent to a return with no operand. Otherwise, flowing off the end of a function other than main (basic.start.main) **results in undefined behavior**.

In practice, most compilers warn about it. Clang, for example, reports:

warning: control reaches end of non-void function [-Wreturn-type]
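
As a minimal sketch (not part of the quoted standard text), the well-defined counterpart of the question's `func` simply returns on every path. The names `describe_sign` and `demo` are hypothetical and exist only for illustration. With GCC or Clang, passing `-Werror=return-type` should turn the warning above into a hard error.

```cpp
#include <cassert>
#include <string>

// Well-defined counterpart of `std::string func() {}`:
// the function returns an explicitly constructed empty string.
std::string func() {
    return std::string();
}

// Hypothetical helper: every control path ends in a return statement,
// so control never flows off the end of the function.
std::string describe_sign(int n) {
    if (n < 0) return "negative";
    if (n > 0) return "positive";
    return "zero";
}

// Hypothetical usage check; call it from main() to exercise both functions.
void demo() {
    assert(func().empty());
    assert(describe_sign(-3) == "negative");
    assert(describe_sign(0) == "zero");
}
```

Calling `demo()` from `main()` exercises both functions; neither relies on whatever happens to be left in a register or on the stack.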

songyuanyao
  • 1
    Will the compiler sometimes try to build an object with the default constructor? – Matt Munson Aug 24 '16 at 08:54
  • 1
    @MattMunson It could, if it wants. It's just undependable. – songyuanyao Aug 24 '16 at 09:01
  • 6
    @MattMunson no. In practice, the return value is either contained in a register or pointed to by one. In the former case the register may or may not contain the correct value. In the latter case, the pointer will either be pointing to the wrong place, or to the correct (but uninitialised) memory. Either way, it's bad. – Richard Hodges Aug 24 '16 at 10:24
  • 1
    @MattMunson That's an *awful* idea. It would make the code so much harder to debug. It's better to have trash returned so that you end up triggering a segfault or some other error and at least you know that something is wrong somewhere and you can start debugging step-by-step. Even better is using `-Wall -Werror` which would make the compiler fail at compile time when detecting this. – Bakuriu Aug 24 '16 at 11:41
7

In C++, what happens when a function that is supposed to return an object ends without a return statement?

It causes undefined behavior. No one can tell what exactly will happen.

πάντα ῥεῖ
  • interesting, I thought it might return an object built from the default constructor. Though, I suppose not all classes have one. – Matt Munson Aug 24 '16 at 08:43
  • @MattMunson That would require `std::string func() { return std::string(); }` – πάντα ῥεῖ Aug 24 '16 at 08:44
  • The compiler writers can probably tell. Also, you could tell by running the program and seeing for yourself. "Undefined behaviour" *does not mean* "it's impossible to tell what happens". – user253751 Aug 24 '16 at 10:39
  • 6
    @immibis It does mean "it's impossible to predict what will happen," though. – Angew is no longer proud of SO Aug 24 '16 at 11:18
  • 4
    Re: "No one can tell what exactly will happen": no, that's overstated. It means **only** that the **language definition** doesn't tell you what happens. Your compiler documentation might tell you. – Pete Becker Aug 24 '16 at 12:35
  • Although it's entirely possible - and legal, according to the standard - that the "undefined behavior" is the computer catching fire. – RoadieRich Aug 24 '16 at 12:38
  • @Angew It does not mean that. See Pete's comment. – user253751 Aug 24 '16 at 21:12
  • 1
    @RoadieRich Legal according to the standard, not legal according to any CPU architecture I'm aware of. Even the so-called "halt and catch fire" instruction is not literal. – user253751 Aug 25 '16 at 05:52
  • 1
    @immibis Take a look at this [LLVM blog about UB](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html). The point is that yes, for *some* UB, the compiler might provide a definition. But in general, results of UB are arbitrary and thus unpredictable. It can be affected by changes in seemingly unrelated code, it can be affected by register/cache state at the time it happens, ... Perhaps it's actually deterministic, but in no way which would be logically predictable. – Angew is no longer proud of SO Aug 25 '16 at 06:48
5

I was curious, so I made a few tests on Visual C++ 2015.

int f()
{
    if (false)
        return 42;

    // oops
}

int main()
{
    int i = f();
}

I had to add the if to get a warning instead of a hard error:

> cl /nologo /FAs /c a.cpp
a.cpp(6) : warning C4715: 'f': not all control paths return a value

The assembly code that's generated is pretty simple and I've removed the irrelevant parts. Here's the meat of f():

f:
    xor eax, eax
    je label
    mov eax, 42
label:
    ret

The xor line is basically eax=0. Because if (false) is a constant condition, the generated code doesn't even bother to do a comparison: the xor also sets the zero flag, so the je that follows is always taken and jumps to label, which just returns from the function. You can see that the "return value" (42) would be stored in eax, but that line is never executed. Therefore, eax == 0.

Here's what main() does:

    call f
    mov _i$[ebp], eax
    ret

It calls f() and blindly copies eax into a location on the stack (where i is). Therefore, i == 0.

Let's try something more complicated with an object and a constructor:

struct S { int i=42; };

S f()
{
    if (false)
        return {};

    // oops
}

int main()
{
    S s = f();
}

What main() does is basically reserve sizeof(S) bytes on the stack, put the address of the first byte in eax, push it as a hidden argument, and then call f():

    lea eax, _s$[ebp]
    push eax
    call f

Again, f() won't do anything, because the jump to the end of the function is always taken:

f:
    xor eax, eax
    je label
    ; some stuff
    ; call S::S(), which would set i to 42
    ; but none of that will happen
label:
    ret

So what happened to the sizeof(S) bytes in main? They were never changed. They contain whatever was already in memory at that particular location. They contain garbage.

This is with an unoptimized build, on a given version of a given compiler. Change the compiler, change the behaviour. Enable the optimizer, drastically change the behaviour.

Don't do it.
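
If a path really is believed to be unreachable (the situation debated in the comments on the question), one option is to end it with a call that never returns, so control cannot flow off the end. A minimal sketch, assuming C++11 or later; `Color`, `unreachable_path`, `name_of`, and `demo_name_of` are hypothetical names, not taken from the tests above:

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical enum used only for this illustration.
enum class Color { Red, Green };

// Never returns: std::abort() terminates the program.
[[noreturn]] void unreachable_path() {
    std::abort();
}

std::string name_of(Color c) {
    switch (c) {
        case Color::Red:   return "red";
        case Color::Green: return "green";
    }
    // Reached only if `c` holds a value outside the enumerators.
    // Aborting here is well defined; flowing off the end would not be.
    unreachable_path();
}

// Hypothetical usage check; call it from main().
void demo_name_of() {
    assert(name_of(Color::Red) == "red");
    assert(name_of(Color::Green) == "green");
}
```

The design choice is to trade undefined behaviour for a deterministic crash: if the "impossible" path is ever taken, the program stops instead of handing garbage to the caller.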

Community
isanae