6

Consider simple code:

#include "stdio.h"

#define N 10U

int main() {
    int a[N] = {0};
    unsigned int i = N;
    int s = 0;

    // Fill a

    while(i--)
        s += a[i];

    printf("Sum is %d\n", s);

    return 0;
}

Does while loop contain undefined behavior because of integer underflow? Do compilers have right to assume that while loop condition is always true because of that and end up with endless loop?

What if i is signed int? Doesn't it contain pitfalls related to array access?

Update

I run this and similar code many times and it worked fine. Moreover, it's popular way to iterate over arrays and vectors backwards. I'm asking this question to make sure that this way is OK from point of view of standard.

At glance, it's obviously not infinite. On other hand, sometimes compiler can "optimize" away some conditions and code assuming that code contains no undefined behavior. It can lead to infinite loops and other unwanted consequences. See this.

Community
  • 1
  • 1
George Sovetov
  • 4,942
  • 5
  • 36
  • 57
  • 3
    Well, this is **not** an infinite loop. – Algirdas Preidžius Apr 08 '16 at 11:17
  • 2
    What happened when you ran it? – Martin James Apr 08 '16 at 11:17
  • No problem, since `while (0)` is false and you don't care for the value of `i` afterwards – MrTux Apr 08 '16 at 11:17
  • Huh, 2-down votes? No comments as to why? All the negativity on this question really frustrates me. This is an [exact question that I had back in the day](http://stackoverflow.com/questions/36498192/does-whilei-s-ai-contain-undefined-behavior-in-c-and-c/36498791?noredirect=1#comment60605735_36498791) and it seems like a good one to me so have a +1. – Jonathan Mee Apr 08 '16 at 12:40
  • @JonathanMee That's actually good news that GCC recognizes this case. All articles and answers I read before say that there such "optimization" is now usual behavior. – George Sovetov Apr 08 '16 at 12:40
  • @GeorgeSovetov So i spoke too soon. The issue you linked isn't a bug in gcc but rather, undefined behavior. I try to explain in the second part of [my answer](http://stackoverflow.com/a/36498791/2642059), but this is reading, and subsequently writing to memory beyond that allocated in the loop. So who knows how long you'll have to read before finding a 5. Here there be dragons and all that. – Jonathan Mee Apr 08 '16 at 17:55
  • 1
    Please ask one question per language, the rationalisation in the language standards may be quite different even if behaviour is similar. – Vality Apr 08 '16 at 17:59

6 Answers6

9

This code doesn't invoke undefined behavior. The loop will be terminated once i becomes 0.

For unsigned int, there is no integer over/underflow. The effect will be same with i as signed except there will no wrapping in this case.

Lightness Races in Orbit
  • 378,754
  • 76
  • 643
  • 1,055
haccks
  • 104,019
  • 25
  • 176
  • 264
6

Does while loop contain undefined behavior because of integer underflow?

No, overflow/underflow is only undefined behavior in case of signed integers.

Do compilers have right to assume that while loop condition is always true because of that and end up with endless loop?

No, because the expression will eventually turn out to be zero.

What if i is signed int? Doesn't it contain pitfalls related to array access?

If it is signed and over/underflows, you invoke undefined behavior.

Lundin
  • 195,001
  • 40
  • 254
  • 396
  • `s/is only undefined behaviour/is only possible/` – Lightness Races in Orbit Apr 08 '16 at 11:23
  • @LightnessRacesinOrbit Feel free to submit a correction to the C standard committee as well, 3.4.3 `"EXAMPLE An example of undefined behavior is the behavior on integer overflow."` – Lundin Apr 08 '16 at 11:35
  • 1
    That statement is correct. But unsigned values do not overflow, so it is not pertinent to this situation. The phrase in your answer is technically correct too, but my point is that I think it suggests the wrong thing - overflow/underflow is not _defined_ behaviour for _unsigned_ integers because it is impossible! :) As such, my suggestion was editorial in nature. – Lightness Races in Orbit Apr 08 '16 at 11:47
4

The loop does not yield undefined behaviour for the following reasons.

i is initialised to 10, and decremented in the loop. When i has a value of zero, decrementing it produces a value equal to UINT_MAX - the largest value an unsigned can represent, but the loop will terminate. a[i] will only ever be accessed (within the loop) for values of i between N-1 (i.e. 9) and 0. Those are all valid indices in array a.

s and all the elements of a are initialized to zero. So all the additions add 0 to 0. That will never overflow nor underflow an int, so can never result in undefined behaviour.

If i is changed to signed int, the decrementing never underflows, and i will have a negative value when the loop terminates. The only net change is in the value that i has after the loop.

Peter
  • 35,646
  • 4
  • 32
  • 74
3

i will wrap around to ~0 (0XFFFFFFFF for 32 bits) but the loop will terminate so there is no UB.

haccks
  • 104,019
  • 25
  • 176
  • 264
Serve Laurijssen
  • 9,266
  • 5
  • 45
  • 98
  • "but the loop will terminate so there is no UB." – nononono, that's the other way around. **Because** there's no UB, **therefore** the compiler must emit code that makes the loop terminate. The reason for the behavior being defined, in turn, is that the decremented object (`i`) is `unsigned`. – The Paramagnetic Croissant Apr 10 '16 at 05:53
1

Your code is well-behaved and contains no undefined behavior for any value of N ≥ 0.

Let us focus on the while-loop, and trace through an execution example with N = 2.

// Initialization
#define N 2U
unsigned int i = N;  // i = 2

// First iteration
while(i--)  // condition = i = 2 = true; i post-decremented to 1
    s += a[i];  // i = 1 is in bounds

// Second iteration
while(i--)  // condition = i = 1 = true; i post-decremented to 0
    s += a[i];  // i = 0 is in bounds

// Third iteration
while(i--)  // condition = i = 0 = false; i post-decremented to 0xFFFFFFFF
// Loop terminated

We can see from this trace that a[i] is always in bounds, and i experiences an unsigned wraparound when decrementing from 0, which is perfectly well-defined.

To answer your second question, if we changed the type of i to signed int, the only behavior that changes in the example trace is that the loop terminates at the same place but i gets decremented to -1, which is also perfectly well-defined.

Thus in conclusion, your code is well-behaved assuming that N ≥ 0, whether i is unsigned int or signed int. (If N is negative and i is signed int, then the loop will keep decrementing until undefined behavior happens at INT_MIN.)

Nayuki
  • 17,911
  • 6
  • 53
  • 80
0

Does while loop contain undefined behavior because of integer underflow?

First off this is a Boolean Conversion:

A prvalue of integral, floating-point, unscoped enumeration, pointer, and pointer-to-member types can be converted to a prvalue of type bool.

The value zero (for integral, floating-point, and unscoped enumeration) and the null pointer and the null pointer-to-member values become false. All other values become true.

So there will be no integer underflow if i is properly initialized, it will reach 0U and that will be cast to a false value.

So what the compiler is effectively doing here is: while(static_cast<bool>(i--))

Do compilers have right to assume that while loop condition is always true because of that and end up with endless loop?

The key reason this isn't an endless loop is that the postfix decrement operator returns the value of the decremented variable. It's defined as T operator--(T&, int) And as discussed in previously the compiler is going to evaluate whether that value is 0U as part of the conversion to bool.

What the compiler is effectively doing here is: while(0 != operator--(i, 1))

In your update you cite this code as a motivation for your question:

void fn(void)
{
    /* write something after this comment so that the program output is 10 */
    int a[1] = {0};
    int j = 0;
    while(a[j] != 5) ++j;  /* Search stack until you find 5 */
    a[j] = 10;             /* Overwrite it with 10 */
    /* write something before this comment */
}

Upon inspection, that entire program has undefined behavior there is only 1 element of a and it's initialized to 0. So for any index other than 0, a[j] is looking off the end of the array. The will continue till a 5 is found or until the OS errors because the program has read from protected memory. This is unlike your loops condition which will exit when the postfix decrement operator returns a value of 0, so there can't be an assumption that this is always true, or that the loop will go on infinitely.

What if i is signed int?

Both of the above conversions are built-in to C++. And they are both defined for all integral types, signed and unsigned. So ultimately this expands to:

while(static_cast<bool>(operator--(i, 1)))

Doesn't it contain pitfalls related to array access?

Again this is calling a built-in operator, the subscript operator: T& operator[](T*, std::ptrdiff_t)

So a[i] is the equivalent of calling operator(a, static_cast<ptrdiff_t>(i))

So the obvious followup question is what's a ptrdiff_t? It's a implementation defined integer, but as such each implementation of the standard is responsible for defining conversions to and from this type, so i will cast correctly.

Jonathan Mee
  • 37,899
  • 23
  • 129
  • 288
  • What does `operator--` returning a reference have to do with anything? And how does `operator++` come into it? Yours, confused. – Lightness Races in Orbit Apr 08 '16 at 11:48
  • @LightnessRacesinOrbit Thanks. Typed the wrong thing on the `operator++`. But what do you mean, "What does `operator--` returning a reference have to do with anything?" If that doesn't return something the `while`-condition has nothing to evaluate? – Jonathan Mee Apr 08 '16 at 11:50
  • 1
    It could return by value instead... and, in fact, it does. Postfix decrement _has to_ (and prefix decrement for the built-in arithmetic types does anyway, per the reference you linked to). Besides, it's obvious that `i--` evaluates to something otherwise the code wouldn't compile. I think that whole section is a red herring for this question. – Lightness Races in Orbit Apr 08 '16 at 11:59
  • I also consider this answer introduces unnecessary red herrings for this question. For basic types, expressions like `i++` are not translated anything like `operator++(i, 1)`. That sort of translation only happens for user-defined types (structs, classes, etc) for which a user-defined `operator++()` is provided. The standard also forbids overloading functions like `operator++()` to act on user-defined types. – Peter Apr 08 '16 at 12:04
  • @LightnessRacesinOrbit I thought the return was important to point out. I did not realize that assignment operators returned a value when I first started programming, so I thought that might be helpful, and as such I'd like to keep it. I've tried to edit it to make it more clear, though perhaps I could do better? – Jonathan Mee Apr 08 '16 at 12:04
  • Well removing the factual inaccuracies would be a good start. :) Again, there are no references here. (And decrement is not an assignment operator) – Lightness Races in Orbit Apr 08 '16 at 12:05
  • @Peter Ummm... I gave a reference on the built-in [`postfix decrement operator`](http://en.cppreference.com/w/cpp/language/operator_incdec#Built-in_postfix_operators): "For every optionally volatile-qualified arithmetic type A other than `bool`, and for every optionally volatile-qualified pointer P to optionally cv-qualified object type, the following function signatures participate in overload resolution: `A& operator--(A&, int)`" Are you suggesting that reference is incorrect? – Jonathan Mee Apr 08 '16 at 12:08
  • @LightnessRacesinOrbit I'm uncertain which factual inaccuracies you refer to, but I assure you I'd like to correct them. I have 4 reference links in the answer, perhaps I should make those more clear instead of in-lining them? And `i++` is the equivalent of `i += 1` how can you say it is not an assignment operator? – Jonathan Mee Apr 08 '16 at 12:11
  • 1
    I described them already: there are no references here, yet you talk about references. Meanwhile, [`i++` is certainly _not_ equivalent to `i += 1`](http://coliru.stacked-crooked.com/a/fe58096ddecd26dc) (though `++i` is); it's close but, anyway, the standard does not include either of them in the category "assignment operators". See §5.3 and §5.17. – Lightness Races in Orbit Apr 08 '16 at 12:12
  • 1
    _"Are you suggesting that reference is incorrect?"_ No, you're misreading and misquoting it. You keep writing `A&` instead of `A`. How could a postfix increment ever return by reference? – Lightness Races in Orbit Apr 08 '16 at 12:17
  • @LightnessRacesinOrbit Ah, excellent comment, I'm sorry I missed that one... I just typed this from memory... can you help my understand why it has to return by value? Anyway my reference was to: http://en.cppreference.com/w/cpp/language/operator_incdec#Built-in_postfix_operators – Jonathan Mee Apr 08 '16 at 12:19
  • @Jonathan - you're quoting that reference out of context. – Peter Apr 08 '16 at 12:19
  • @Peter How could I be quoting that out of context, It's referring to: "For every optionally volatile-qualified arithmetic type A other than `bool`", this has nothing to do with structs or classes, that's the build in operation that's called. Do you disagree? – Jonathan Mee Apr 08 '16 at 12:24
  • @LightnessRacesinOrbit Ah, found a reference on why it has to return by value, which makes so much sense, I don't see why that wasn't clear to me before: http://stackoverflow.com/a/14825829/2642059 – Jonathan Mee Apr 08 '16 at 12:36
  • @Peter Perhaps you missed my comment, could you clarify why you think translating "arithmetic type `A` other than `bool`" to mean `int` is taking anything out of context? – Jonathan Mee Apr 11 '16 at 01:47