Is while(1); undefined behavior in C?

Question

In C++11 it is undefined behavior, but is while(1); also undefined behavior in C?

I guess if `for(;;)` [statement is well defined in C](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) then `while(1)` should not be Undefined in C.... remember detection of infinite loop is Undecidable problem.. — Grijesh Chauhan, May 08 '13 at 10:10
If you like I could elaborate a bit more on 6.8.5 ad 6 and especially why it is very unlikely that the compiler company I work for will make use of this clause. — Bryan Olivier, May 08 '13 at 10:46
Possible duplicate of [Is an (empty) infinite loop undefined behavior in C?](https://stackoverflow.com/questions/15595493/is-an-empty-infinite-loop-undefined-behavior-in-c) — jinawee, Jan 16 '19 at 17:36

score 35 · Accepted Answer · edited Jun 20 '20 at 09:12

It is well-defined behavior. In C11 a new clause 6.8.5 ad 6 has been added:

An iteration statement whose controlling expression is not a constant expression,¹⁵⁶⁾ that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.¹⁵⁷⁾

¹⁵⁷⁾ This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.

Since the controlling expression of your loop is a constant, the compiler may not assume the loop terminates. This is intended for reactive programs that should run forever, like an operating system.

However, for the following loop the behavior is unclear:

a = 1; while(a);

In effect a compiler may or may not remove this loop, resulting in a program that may terminate or may not terminate. That is not really undefined, as it is not allowed to erase your hard disk, but it is a construction to avoid.

There is, however, another snag. Consider the following code:

a = 1; while(a) while(1);

Now, since the compiler may assume the outer loop terminates, the inner loop must also terminate; how else could the outer loop terminate? So with a really smart compiler, a while(1); loop that should not terminate has to be surrounded by such non-terminating loops all the way up to main. If you really want the infinite loop, you'd better read or write some volatile variable in it.
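To make the volatile advice concrete, here is a minimal sketch of a loop whose controlling expression reads a volatile object. The names keep_running, on_interrupt and spin_until_interrupted are made up for illustration; they do not come from the standard or from any particular compiler.

```c
#include <signal.h>

/* Hypothetical flag. volatile sig_atomic_t is the object type the
   standard lets a signal handler write to. */
static volatile sig_atomic_t keep_running = 1;

/* Hypothetical handler: clears the flag so the loop below can exit. */
static void on_interrupt(int sig)
{
    (void)sig;
    keep_running = 0;
}

/* Spins until keep_running becomes 0 and returns the number of spins.
   Every evaluation of the controlling expression accesses a volatile
   object, so the loop does not satisfy the conditions of 6.8.5
   paragraph 6 and may not be assumed to terminate. */
static unsigned long spin_until_interrupted(void)
{
    unsigned long spins = 0;

    while (keep_running) {
        spins++;
    }
    return spins;
}
```

A caller would install the handler with signal(SIGINT, on_interrupt) and then call spin_until_interrupted(); the loop keeps running until the signal arrives.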

Why this clause is not practical

It is very unlikely our compiler company is ever going to make use of this clause, mainly because it is a very syntactical property. In the intermediate representation (IR), the difference between the constant and the variable in the above examples is easily lost through constant propagation.

The intention of the clause is to allow compiler writers to apply desirable transformations like the following. Consider a not so uncommon loop:

int f(unsigned int n, int *a)
{       unsigned int i;
        int s;
        
        s = 0;
        for (i = 10U; i <= n; i++)
        {
                s += a[i];
        }
        return s;
}

For architectural reasons (for example hardware loops) we would like to transform this code to:

int f(unsigned int n, int *a)
{       unsigned int i;
        int s;
        
        s = 0;
        for (i = 0; i < n-9; i++)
        {
                s += a[i+10];
        }
        return s;
}

Without clause 6.8.5 ad 6 this is not possible, because if n equals UINT_MAX, the loop may not terminate. Nevertheless it is pretty clear to a human that this is not the intention of the writer of this code. Clause 6.8.5 ad 6 now allows this transformation. However the way this is achieved is not very practical for a compiler writer as the syntactical requirement of an infinite loop is hard to maintain on the IR.
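The arithmetic behind this can be checked with a small sketch. The helper names below are hypothetical and only illustrate the two loop bounds; they are not how any particular compiler reasons about the loop.

```c
#include <limits.h>

/* Trip count of `for (i = 10U; i <= n; i++)`, counted directly.
   Only call this with n < UINT_MAX: for n == UINT_MAX the condition
   i <= n holds for every unsigned value and the loop never exits. */
static unsigned int original_trip_count(unsigned int n)
{
    unsigned int count = 0;

    for (unsigned int i = 10U; i <= n; i++) {
        count++;
    }
    return count;
}

/* Trip count of the transformed loop `for (i = 0; i < n - 9; i++)`.
   Unsigned subtraction wraps, so this matches the original loop only
   for n >= 9. */
static unsigned int transformed_trip_count(unsigned int n)
{
    return n - 9U;
}

/* Shows why n == UINT_MAX is the problem case: incrementing an
   unsigned i past UINT_MAX wraps to 0, and 0 <= UINT_MAX still holds. */
static int condition_survives_wraparound(void)
{
    unsigned int i = UINT_MAX;

    i++; /* wraps to 0 */
    return i <= UINT_MAX;
}
```

For 9 <= n < UINT_MAX both counts agree. For n < 9 the expression n - 9 wraps to a huge value, so a real compiler would presumably guard the transformed loop for small n as well; that guard is my assumption, it is not shown in the example above.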

Note that it is essential that n and i are unsigned as overflow on signed int gives undefined behavior and thus the transformation can be justified for this reason. Efficient code however benefits from using unsigned, apart from the bigger positive range.

An alternative approach

Our approach would be that the code writer has to express his intention by for example inserting an assert(n < UINT_MAX) before the loop or some Frama-C like guarantee. This way the compiler can "prove" termination and doesn't have to rely on clause 6.8.5 ad 6.
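A minimal sketch of what that could look like in source code, using the standard assert macro. The function name sum_from_10 is hypothetical, and whether a given compiler actually exploits the assertion as an optimization hint is an assumption, not something the standard promises.

```c
#include <assert.h>
#include <limits.h>

/* Same loop as f above, with the precondition written down.
   sum_from_10 is a hypothetical name used only for this sketch. */
static int sum_from_10(unsigned int n, const int *a)
{
    unsigned int i;
    int s = 0;

    /* Rules out the single input for which i <= n can never become false. */
    assert(n < UINT_MAX);

    for (i = 10U; i <= n; i++) {
        s += a[i];
    }
    return s;
}
```

Note that assert expands to nothing when NDEBUG is defined, so in a release build the information is gone unless the implementation offers some other way to state the assumption.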

P.S: I'm looking at a draft of April 12, 2011 as paxdiablo is clearly looking at a different version, maybe his version is newer. In his quote the element of constant expression is not mentioned.

I'm looking at n1570, too, and I assure you that paxdiablo's quote is there, at the end of the page numbered 150 (168 in Adobe Reader page numbers)... — autistic, May 08 '13 at 11:00
@undefinedbehaviour I just downloaded n1570 and it still has the version in my quote of the clause, where an exception is made for "whose controlling expression is not a constant expression". But as I argue above, it doesn't really help. — Bryan Olivier, May 08 '13 at 11:35
Ah. I hadn't noticed that addition. Very well. The one you're looking at is the most current C11 standard draft. — autistic, May 08 '13 at 11:42
The compiler is already forced to keep track of whether a propagated constant is a constant expression for other reasons. For instance, `sizeof(*(char (*)[1])a++)` does not increment `a`, but `sizeof(*(char (*)[non_constexpr_1])a++)` does. — R.. GitHub STOP HELPING ICE, May 08 '13 at 14:12
@R.. That is some obscure code, I'll have to dive into it. But I'm pretty sure that this can be resolved in the front-end and that the difference does not migrate into the IR. — Bryan Olivier, May 08 '13 at 14:23
The distinction between a constant-expression zero and non-constant-expression zero is also important for null pointer constants, which affect the resulting type of a ternary operator expression. — R.. GitHub STOP HELPING ICE, May 08 '13 at 15:41
@R.. Once the type is settled either in or directly after the front-end, constant propagation can be freely applied. Ergo: no constant propagation inside the front-end. Bad idea anyway. Or did I miss the gist of your remark? — Bryan Olivier, May 08 '13 at 16:44
The C++ version of the thread makes a strong case that "may be assumed to terminate" means that if it would not actually terminate then the code has undefined behaviour. — M.M, Jun 06 '15 at 05:08
@MattMcNabb: It's too bad the C standard doesn't specify what a compiler is allowed to do on the basis of such an assumption; it would have been much clearer to say "The execution of any code may be deferred until its first visible side-effect. Unless a loop *syntactically* guaranteed to be infinite, the time required for execution shall not be considered an observable side-effect, even if it is infinite". That would make clear that if code which followed the loop would invoke Undefined Behavior without any control or value dependencies on anything done in the loop... — supercat, Jul 06 '15 at 16:43
...such code could be "executed" before the loop, but if a program received inputs which would cause it to loop indefinitely and even with the rescheduling rule there would be no means by which the program could legitimately invoke Undefined Behavior, behavior of the program with that input would be mostly defined (certain details of operation may be unspecified). Making the endless loop trigger UB in and of itself would require that programs whose purpose is to search for a solution until one is found or an operator pulls the plug would need to include some "dummy side-effect" to... — supercat, Jul 06 '15 at 16:50
...prevent UB in case the problem had no solution. Adding such side-effects might not be overly difficult for a programmer, but may substantially impair the kinds of optimizations a compiler would be allowed to perform when generating code to actually look for a solution. Unfortunately, given the Standard text as it is, I don't know any strictly-conforming way to write loops that can't be proven to terminate except by adding such silly dummy side-effects. — supercat, Jul 06 '15 at 16:54
There's an interesting wrinkle here - what exactly is meant by 'controlling expression'? In particular, what about a while(1) loop with a conditional break or return ? — TLW, Oct 18 '17 at 01:47

unwind · Answer 2 · 2013-05-08T09:14:20.033

After checking in the draft C99 standard, I would say "no", it's not undefined. I can't find any language in the draft that mentions a requirement that iterations end.

The full text of the paragraph describing the semantics of the iterating statements is:

An iteration statement causes a statement called the loop body to be executed repeatedly until the controlling expression compares equal to 0.

I would expect any limitation such as the one specified for C++11 to appear there, if applicable. There is also a section named "Constraints", which also doesn't mention any such constraint.

Of course, the actual standard might say something else, although I doubt it.

The forward progress guarantee was added in C11 (N1570) – M.M Nov 28 '19 at 01:03

paxdiablo · Answer 3 · 2016-04-06T01:13:46.860

The following statement appears in C11 6.8.5 Iteration statements /6:

An iteration statement whose controlling expression is not a constant expression, that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.

Since while(1); uses a constant expression, the implementation is not allowed to assume it will terminate.

A compiler is free to remove such a loop entirely if the expression is non-constant and all other conditions are similarly met, even if it cannot be proven conclusively that the loop would terminate.
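As an illustration of a loop that meets all of those conditions, here is a sketch built on the Collatz iteration, for which termination is not proven in general. The function name is hypothetical. As far as I read the clause, an implementation would be permitted to compile this function down to a plain return 1; without running the loop at all.

```c
/* Hypothetical example: returns 1 once the Collatz iteration starting
   at n reaches 1. The controlling expression n != 1UL is not a constant
   expression, and the loop performs no I/O, volatile access,
   synchronization or atomic operation, so under C11 6.8.5 paragraph 6
   it may be assumed to terminate.
   Unsigned arithmetic wraps, so there is no overflow UB here.
   Do not call it with n == 0: the iteration then stays at 0 forever. */
static int collatz_reaches_one(unsigned long n)
{
    while (n != 1UL) {
        n = (n % 2UL == 0UL) ? n / 2UL : 3UL * n + 1UL;
    }
    return 1;
}
```

By contrast, replacing the condition with the constant 1 would take the loop out of the scope of the clause, which is the while(1); case from the question.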

edited Apr 06 '16 at 01:13

answered May 08 '13 at 08:57

paxdiablo


It's not exactly *free to assume it will terminate*. There would need to be further processing to ensure that the observable behaviour of the program is met. If there's no way code following the loop can be reached, the compiler would need to optimise that away, too. – autistic May 08 '13 at 11:09
@undefinedbehaviour I beg to differ. I do think that any observable behavior after the loop, that may seem unreachable because of the loop with a variable, by token of this clause may become reachable and does _not_ have to be optimized away (first). – Bryan Olivier May 08 '13 at 11:44
@R.I.P.Seb: I wish the Standard had specified what a compiler was allowed to *do* on the basis of an assumption. IMHO, what may make sense as a default would be to say that "unsigned long long test(unsigned long long a) { do { a=outsideFunctionWith(a); } while(a != 1); printf("It terminated!"); printf("Result=%llu", a); return a; }" would be allowed to behave as though the "while" executed in parallel with the first printf, but the second printf [and the return from the function] would have to wait until "a" was actually assigned a value of one. If the purpose of the function... – supercat Apr 05 '16 at 16:19
...is to confirm that some function will eventually return 1, having an optimizer decide that it "must", and therefore does, would be unhelpful. – supercat Apr 05 '16 at 16:20
@BryanOlivier As I wrote, "There would need to be further processing to ensure that the observable behaviour of the program is met." Can you tell me why [this program](http://ideone.com/gJ7h0G) doesn't print anything? – autistic Apr 06 '16 at 00:52
@R.I.P.Seb, the standard says that it *may* be assumed to terminate, not that it *will* terminate - it's up to the implementation whether it actually does or not. In any case, the reason your *specific* code `for(;;); printf ...` doesn't print anything is almost certainly down to `C11 6.8.5.3/2: An omitted expression-2 is replaced by a nonzero constant.` - it's therefore identical to `for(;1;)` hence cannot be optimised out of existence due to the constant expression limitation. – paxdiablo Apr 06 '16 at 01:17
However, I thank you for drawing my attention back to this question. At some point ISO added the constant expression clause which made my answer totally wrong. Have fixed it up. – paxdiablo Apr 06 '16 at 01:20
@paxdiablo I'm glad you had the opportunity to improve your answer, but the question I posed was at Bryan Olivier, and though that's only a secondary discussion here I do feel the answer to it didn't quite fit into the line of reasoning I was trying to draw. Nonetheless, I'd be crazy to be unhappy to conclude the discussion at this point. This edit has actually made your answer very good. Shame it couldn't be written back then, eh? :( – autistic Apr 06 '16 at 01:40

autistic · Answer 4 · 2013-05-08T11:44:48.737

The simplest answer involves a quote from §5.1.2.3p6, which states the minimal requirements of a conforming implementation:

The least requirements on a conforming implementation are:

— Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

— At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.

— The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

This is the observable behavior of the program.

If the machine code fails to produce the observable behaviour due to optimisations performed, then the compiler isn't a C compiler. What is the observable behaviour of a program that contains only such an infinite loop, at the point of termination? The only way such a loop could end is by a signal causing it to end prematurely. In the case of SIGTERM, the program terminates. This would cause no observable behaviour. Hence, the only valid optimisation of that program is the compiler pre-empting the system closing the program and generating a program that ends immediately.

/* unoptimised version */
#include <stdio.h>

int main(void) {
    for (;;);
    puts("The loop has ended");
}

/* optimised version */
int main(void) { }

One possibility is that a signal is raised and longjmp is called to cause execution to jump to a different location. It seems like the only place that could be jumped to is somewhere reached during execution prior to the loop, so providing the compiler is intelligent enough to notice that a signal is raised causing the execution to jump to somewhere else, it could potentially optimise the loop (and the signal raising) away in favour of jumping immediately.

When multiple threads enter the equation, a valid implementation might be able to transfer ownership of the program from the main thread to a different thread, and end the main thread. The observable behaviour of the program must still be observable, regardless of optimisations.

Your name is almost like a novelty account for this question. — Tony The Lion, May 08 '13 at 10:41
