2

I want to iterate over all possible values of signed int, from INT_MIN to INT_MAX. The obvious code

for (int x = INT_MIN; x <= INT_MAX; x++)
    do_something(x);

is broken due to undefined behavior caused by signed integer overflow. What is worth noting is that so is its apparently clever variant

int x = INT_MIN;
do {
    do_something(x);
} while (x++ < INT_MAX);

What is interesting is that in this second example clang -O2 -fsanitize=undefined correctly reports the runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int', while gcc -O2 -fsanitize=undefined does not. I imagine gcc's -fsanitize=signed-integer-overflow is as broken as -ftrapv.

The only ways I could find to express this iteration are with a goto:

int x = INT_MIN;
loop: {
    do_something(x);
    if (x < INT_MAX) {
        x++;
        goto loop;
    }
}

using an infinite loop and breaking manually:

int x = INT_MIN;
while (1) {
    do_something(x);
    if (x < INT_MAX)
        x++;
    else
        break;
}

and exploiting the short-circuit evaluation to increment the variable only when it's safe to do so:

int x = INT_MIN;
do {
    do_something(x);
} while (x < INT_MAX && ++x != INT_MIN);

As pointed out by Steve Jessop, in the do-while solution one can use while (x < INT_MAX && (++x, 1)); instead.

Among the three version, I'm not particularly against the goto version, I don't like the do-while version very much (especially the ++x != INT_MIN bit...), but all in all I prefer the while (1) version. My question is the following.

Is a C compiler allowed to completely eliminate the infinite while (1) loop in case do_something doesn't perform input/output operations, nor accesses volatile objects?

I know that it cannot for C11, but I'm not so sure about previous versions.

Edit

I will try to describe more in detail what I'm worried about. Let's say that I'm using this loop to validate the correctness of some other code. For instance, I might do

#include <stdio.h>
#include <limits.h>

int add_wrap_unsigned(int x, int y) {
    return (unsigned) x + (unsigned) y;
}

int add_wrap_builtin(int x, int y) {
    int res;
    __builtin_add_overflow(x, y, &res);
    return res;
}

int check() {
    int x = INT_MIN;
    while (1) {
        if (add_wrap_unsigned(x, 1) != add_wrap_builtin(x, 1))
            return 1;
        if (x < INT_MAX)
            x++;
        else
            break;
    }
    return 0;
}

int main() {
    if (check())
        printf("counterexample found");
    else
        printf("no counterexample");
    return 0;
}

This correctly reports no counterexample with clang. check gets compiled to a simple return 0. However, changing a single line (line 22 from break; to x = INT_MIN;) changes completely the behavior: check becomes a noop infinite loop, but main now prints counterexample found. All this happens even with -std=c11.

My worry is that I cannot trust the first version, because the compiler might not have done the correct thing, as in the second case.

Federico
  • 582
  • 3
  • 12
  • 1
    [Very related](https://stackoverflow.com/questions/2457389/looping-on-a-closed-range) but for C++ and some answers are using C++ constructs. – AProgrammer Dec 13 '19 at 17:59
  • @AProgrammer Oh that's great. In particular there is the `do {} while (begin != end && (++begin, true));` solution! I'm happy I have found it on my own :-) – Federico Dec 13 '19 at 18:01
  • Can you change `int` to `long int` or `int64_t`. – Fiddling Bits Dec 13 '19 at 18:03
  • @FiddlingBits What do you mean? Of course you can change it, by using the appropriate constants too. – Federico Dec 13 '19 at 18:05
  • @Federico Yes, but if you don't change the constants you won't overflow, right? – Fiddling Bits Dec 13 '19 at 18:09
  • No, with the last three snippets you never overflow. You can use whatever signed type for x, as long as you use also the appropriate constants from [limits.h](https://www.tutorialspoint.com/c_standard_library/limits_h.htm) or [stdint](https://pubs.opengroup.org/onlinepubs/009695399/basedefs/stdint.h.html). – Federico Dec 13 '19 at 18:11
  • My question is whether a compiler is allowed to mess up the `while (1)` solution, by completely eliminating the infinite loop. – Federico Dec 13 '19 at 18:12
  • The overflow behaviour for (signed) `int` is implementation defined. `++x != INT_MIN` assumes the behaviour is to wrap cleanly, as `unsigned` would. – Weather Vane Dec 13 '19 at 18:23
  • @WeatherVane The overflow for signed integers is __undefined behavior__, not implementation defined. `++x != INT_MIN` does not overflow because `x` is incremented only if `x < MAX_INT`. This is because `&&` is short-circuiting. `++x != INT_MIN` is just an expression which increments `x` and then evaluates to true. It doesn't assume wrapping of signed overflow. – Federico Dec 14 '19 at 11:09
  • @Federico the *only* thing the compiler can do is to assume that your loop *halts*, if the rules match, and that can happen only when the controlling expression is non-constant. If you need the values in any way then the compiler must calculate those values too... – Antti Haapala -- Слава Україні Dec 14 '19 at 16:21
  • 1
    There is a really good answer to your questions - `How to iterate over all possible values of int?` and `Is a C compiler allowed to completely eliminate the infinite loop` which already are two questions. Now you edit your qiestion and posted another question about specific compiler optimization of your code. Please only one question per question. Please either ask about how to iterate, about what compiler is allowed to do or why doesn't your code does not work. I think I would vote on asking a separate question about your edit. – KamilCuk Dec 14 '19 at 18:09

1 Answers1

2

The question is whether the C compiler is allowed to remove an infinite loop. No, it is not allowed to remove an infinite loop whose controlling expression is constant (for (;;) {} or while (1) { }), but the C standard explicitly states in C11/C17 6.8.5p6 that

  1. An iteration statement whose controlling expression is not a constant expression,156) that performs no input/output operations, does not access volatile objects, and performs no synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to terminate.157)

I.e. the compiler is allowed to optimize out a loop that does not have any side effects and that does have a non-constant controlling expression. I.e. it is an utterly useless loop when it comes to the as-if rule. This exception has been granted in the C11 revision of the standard. This did not apply before C11 so if that behaviour was present then then it wasn't standards-compliant.


You do not need to guard the increment, just break:

for (int x = INT_MIN; ; x++) {
    do_something(x);
    if (x == INT_MAX) break;
}

In this case, the controlling expression is a constant expression (implicit true value), and hence must not be optimized out as per C11/C17 6.8.5p6, but the compiler must prove it explicitly to terminate.


Now what this gets optimized to depends on the compiler and the flags - with -O3 Clang 9.0 compiles it to the equivalent of

for (int x = INT_MIN; x != INT_MIN; x++)
    do_something(x)

if the overflow were defined with wraparound - but as we know, x86 machine code does have well-defined wraparound.

The behaviour of the machine code produced by GCC 9.2 is similar to

int x = INT_MIN;
do_something(x);
do {
   do_something(++x);
} while (x < INT_MAX);

which written like that in C would be free of undefined behaviour and virtually indistinguishable in performance from that of by Clang (but not in size).

  • Thanks for the answer. The construct you provide is very nice and clean. Also the analysis of the assembly generated is correct. It doesn't answer my question however, does it? My question is specifically whether a C optimizing compiler is allowed to remove the infinite loop `while (1)`. If you are not familiar with the phenomenon, I'm speaking about [this](https://blog.regehr.org/archives/140) nasty thing. – Federico Dec 14 '19 at 11:15
  • See [clang](https://gcc.godbolt.org/z/jTVw8y) and [icc](https://gcc.godbolt.org/z/8uJWRz) removing the loop. [gcc](https://gcc.godbolt.org/z/Vq3-Ti) does not, however. – Federico Dec 14 '19 at 11:37
  • @Federico it is a completely different thing. You didn't reference that in the question at all. It is not a bug though... let me address it... – Antti Haapala -- Слава Україні Dec 14 '19 at 16:12