
Are noreturn attributes on never-returning functions necessary, or are they just an optimization (arguably a premature one; at least for exits, I can't imagine why one would optimize there)?

It was explained to me that in a context such as

_Noreturn void myexit(int s) {
   exit(s);
}
// ...
if (!p) { myexit(1); }
f(*p);
// ...

noreturn prevents the !p branch from being optimized out. But is it really permissible for a compiler to optimize out that branch? I realize the rationale for optimizing it out would be: "Undefined behavior can't happen. If p == NULL, dereferencing it is UB, therefore p can never be NULL in this context, therefore the !p branch does not trigger". But can't the compiler resolve the problem just as well by assuming that myexit could be a function that doesn't return (even if it's not explicitly marked as such)?

Toby Speight
Petr Skocik
  • I'm hesitant to mark this C++ too because of downvote happy purists, but C++ has noreturn attributes (`[[noreturn]]`) too, and a C/C++ answer would be most welcome. – Petr Skocik Jul 18 '16 at 07:45
  • _"noreturn prevents the `!p` branch from being optimized out."_ why? – mvidelgauz Jul 18 '16 at 07:50
  • Possible duplicate - http://stackoverflow.com/questions/10538291/what-is-the-point-of-noreturn – mvidelgauz Jul 18 '16 at 07:52
  • @mvidelgauz That cannot be a dup, because of the tags. Please be attentive. – 2501 Jul 18 '16 at 07:54
  • I don't believe it's a duplicate of the linked question. I'm not asking what noreturn attributes are good for or whether they are nice. I'm asking whether it's **necessary** to apply them to functions that exit. The rationale for possibly optimizing out the `!p` branch is in the question and whether or not that's a good rationale is part of the question. – Petr Skocik Jul 18 '16 at 07:58
  • Ok, I have read that rationale but I still can't get it, sorry (i.e. what exactly makes compiler **sure** that "p can never be NULL", to me this code looks like run-time check for this condition) – mvidelgauz Jul 18 '16 at 08:07
  • _"I'm asking whether it's necessary to apply.."_ In other words, you are asking about possible situations when _not applying_ it can lead to bad results? – mvidelgauz Jul 18 '16 at 08:10
  • @mvidelgauz The rumor has it that compilers are **allowed to assume** undefined behavior never occurs and that some compilers would use that assumption to deduce possibly ill-advised optimization opportunities. There's a whole bunch of blog posts and videos on this and how it can cause problems on modern, "very smart, very optimizing" compilers. – Petr Skocik Jul 18 '16 at 08:12
  • @mvidelgauz That is correct. – Petr Skocik Jul 18 '16 at 08:12
  • But with such "optimization" compiler will remove !p branch regardless of whether there is `noreturn` or not. Am I wrong? Are you saying that `noreturn` will give compiler an additional hint saying _"in case p is NULL f(*p) code won't be reached so this is useful code that should NOT be removed so that exit will happen instead of UB"_? – mvidelgauz Jul 18 '16 at 08:28
  • The assertion that p is not NULL is actually only valid below the if, so the compiler cannot remove the if statement. – martinkunev Jul 18 '16 at 08:46
  • @martinkunev: No, UB is allowed to go back in time and change previous results. 'Undefined' really has no limit here, not even nasal demons. – sp2danny Jul 18 '16 at 12:59
  • @sp2danny UB is not some fairy that changes your code. If what you're saying were true, how could you verify that a variable is not NULL? Dereferencing a NULL pointer is UB so the compiler will just ignore that possibility and optimize, asserting that when a pointer is dereferenced, it is not NULL. For good explanation of UB, I would suggest this: blog.llvm.org/2011/05/what-every-c-programmer-should-know.html – martinkunev Jul 18 '16 at 14:11
  • @martinkunev The compiler can assume that UB never occurs and therefore rearrange code in a way that would make it seem like the UB went back in time, were it to occur. See this [article](https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=633) for an example. – a3f Jul 19 '16 at 04:51
  • @a3f This is not the case here. The contract is the compiler produces a program with the SAME behavior as the code, if the code does not lead to UB. In the example, the code does not lead to UB (NULL pointer will never be dereferenced). The compiler can rearrange the code but it can NOT change the behavior of a conforming program. It has only the right to assume that when `f` is called, `p` is not NULL. The compiler cannot prove that the `if` statement doesn't change `p` and that `f` is always called when the `if` is reached (because this is false). So the two lines cannot be rearranged. – martinkunev Jul 19 '16 at 07:42
  • @martinkunev Whether this is or isn't the case here is the point of OP's question. Refer to my answer for my assessment. Feel free to add your answer too. I was just addressing your "UB is not some fairy that changes your code." comment. – a3f Jul 19 '16 at 07:56

2 Answers


This allows for several optimizations to take place. First, the call itself may allow for a simplified setup: not all registers have to be saved, and a jmp instruction can be used instead of call, or similar. Then the code after the call can also be optimized, because control never branches back into the normal flow.

So yes, _Noreturn is usually valuable information for the compiler.

But as a direct answer to your question, no, this is a property for optimization, so it is not necessary.
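As a rough illustration of the optimizations described above, here is a minimal sketch. The names `die` and `checked_div` are hypothetical and not from the question. Whether a given compiler actually emits a jump instead of a call, or skips saving registers, depends on the compiler and its flags, so treat the comments as what is permitted rather than what is guaranteed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reports an error and terminates the program.
 * _Noreturn (C11) promises the compiler that control never comes back. */
_Noreturn static void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Hypothetical caller. Because die() is _Noreturn, the compiler does not
 * need to preserve a, b, or any other live value across the call, and it
 * may lay out the error path as a cold block with no fall-through. */
int checked_div(int a, int b)
{
    if (b == 0)
        die("division by zero"); /* nothing after this is reachable on this path */
    return a / b;
}
```

Removing `_Noreturn` from `die` leaves the behavior of the program unchanged; only the information available to the optimizer (and to diagnostics) differs.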

Jens Gustedt

Axiom: The standard is the definitive resource on what's well-defined in C.

  • The standard specifies assert, therefore using assert is well-defined.
  • assert conditionally calls abort, a _Noreturn function, therefore that's allowed.
  • Every usage of assert is inside a function. Therefore functions may or may not return.
  • The standard has this example:

    _Noreturn void g (int i) { // causes undefined behavior if i <= 0
        if (i > 0) abort();
    }
    

    Therefore functions conditionally returning must not be _Noreturn. This means:

    • For externally defined functions, the compiler has to assume the function might not return and isn't free to optimize out the if-branch
    • For "internally" defined functions, the compiler can check whether the function indeed always returns and optimize out the branch.

In both cases, compiled program behavior aligns with what a non-optimizing abstract C machine would do and the 'as-if' rule is observed.
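To make the reasoning above concrete, here is a sketch of the question's scenario with `myexit` deliberately left without `_Noreturn`. The function `use` and the body of `f` are assumptions added for illustration. Since `myexit` never returns, the dereference is never reached with a null `p`, so a conforming compiler has to keep the check.

```c
#include <stdlib.h>

/* Wrapper from the question, deliberately NOT marked _Noreturn. */
void myexit(int s)
{
    exit(s);
}

/* Hypothetical stand-in for the question's f(). */
static int f(int v)
{
    return v + 1;
}

/* Hypothetical caller. The null check has to survive optimization:
 * the dereference below is undefined only if myexit() returns, and the
 * compiler may not assume that it does. */
int use(const int *p)
{
    if (!p) { myexit(1); }
    return f(*p);
}
```

Here `myexit` is defined in the same translation unit, so a compiler could also inspect its body; if it were only declared, the compiler would have to allow for both a returning and a non-returning implementation.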

a3f
  • Thought writing it out as kind of proof would be cool, now looking at it, I am not so sure anymore. But it answers the question I think. :-) Feel free to edit it to make it more formal. – a3f Jul 19 '16 at 04:42
  • A compiler can't assume that externally defined functions don't return; instead it must be prepared to handle both the cases in which the function does return (code which follows the function cannot be omitted) and also the cases where it doesn't (code which follows the function cannot affect the execution of the function or anything before it). – supercat Aug 01 '16 at 18:11
  • @supercat, reworded it. Thanks. – a3f Aug 01 '16 at 18:13