23

Given the following program:

#include <stdio.h>
int main(void)
{
    int i = 1, j = 2;
    int val = (++i > ++j) ? ++i : ++j;
    printf("%d\n", val); // prints 4
    return 0;
}

The initialization of val seems like it could be hiding some undefined behavior, but I don't see any point at which an object is either modified more than once or modified and used without a sequence point in between. Could someone either correct or corroborate me on this?

machine_1
max1000001
  • 5
    Is there a sequence point? Please see [this answer](https://stackoverflow.com/a/3575375/4142924) which states there is one *"Between the evaluations of the first operand of the conditional ?: operator and whichever of the second and third operands is evaluated (6.5.15)."* – Weather Vane Mar 14 '19 at 19:27
  • 1
    You got 4. What did you expect? – Bob Jarvis - Слава Україні Mar 14 '19 at 19:29
  • I expected 4. I don't think this code invokes UB, but I was told on another question that it does. Just wanted to eliminate any confusion as to whether that specific statement causes UB, and maybe get a better explanation than the one I provided in the question. – max1000001 Mar 14 '19 at 19:32
  • the ternary expression guarantees the sequence point. I cannot find any reference to `(++i > ++j)` though. Is `>` a sequence point? – Jean-François Fabre Mar 14 '19 at 19:38
  • @Jean-FrançoisFabre `i` and `j` will both be incremented somewhere between the previous sequence point and the comparison. – Weather Vane Mar 14 '19 at 19:40
  • 5
    @Jean-FrançoisFabre No, but it doesn't need to be. There's no need for a sequence point between two changes of two different variables. `++i > ++i` would be UB though. – sepp2k Mar 14 '19 at 19:41
  • Yes, but both `i` and `j` are only used once before the next sequence point occurs, so I don't believe there is UB there. – max1000001 Mar 14 '19 at 19:42
  • @sepp2k overlooked that. Thanks. I never understood why people use those operators in expressions so much. One slip and you get a nasty bug. – Jean-François Fabre Mar 14 '19 at 19:42
  • Regardless of whether or not it is UB, this is awful code. – KevinZ Mar 15 '19 at 03:49

4 Answers

37

The behavior of this code is well defined.

The first expression in a conditional is guaranteed to be evaluated before either the second expression or the third expression, and only one of the second or third will be evaluated. This is described in section 6.5.15p4 of the C standard:

The first operand is evaluated; there is a sequence point between its evaluation and the evaluation of the second or third operand (whichever is evaluated). The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0; the result is the value of the second or third operand (whichever is evaluated), converted to the type described below.

In the case of your expression:

int val = (++i > ++j) ? ++i : ++j;

++i > ++j is evaluated first. The incremented values of i and j are used in the comparison, so it becomes 2 > 3. The result is false, so then ++j is evaluated and ++i is not. So the (again) incremented value of j (i.e. 4) is then assigned to val.
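
As a rough sketch of the evaluation order described above, the conditional can be spelled out with an explicit if/else. The function names conditional_form, expanded_form, and check_forms_agree are made up for this illustration, and i and j are passed by value so that each call starts from fresh copies.

```c
#include <assert.h>

/* The statement from the question, wrapped in a function (hypothetical name). */
int conditional_form(int i, int j)
{
    return (++i > ++j) ? ++i : ++j;
}

/* The same evaluation written out one step at a time (hypothetical name). */
int expanded_form(int i, int j)
{
    ++i;                    /* first operand: i is modified once ...       */
    ++j;                    /* ... and j is modified once, in either order */
    int first = (i > j);    /* with i = 1, j = 2 this compares 2 > 3       */

    /* Sequence point: both increments above are complete at this point. */
    if (first) {
        return ++i;         /* second operand, evaluated only if first != 0 */
    }
    return ++j;             /* third operand, evaluated only if first == 0  */
}

/* Small self-check: the two forms agree on both branches. */
void check_forms_agree(void)
{
    assert(conditional_form(1, 2) == 4);   /* condition false: j goes 2 -> 3 -> 4 */
    assert(expanded_form(1, 2) == 4);
    assert(conditional_form(5, 2) == 7);   /* condition true: i goes 5 -> 6 -> 7  */
    assert(expanded_form(5, 2) == 7);
}
```

In the expanded form, each object is modified at most once on either side of the sequence point, which is why no conflict arises.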

dbush
8

Too late, but maybe useful.

(++i > ++j) ? ++i : ++j;

In Annex C (informative), "Sequence points", of ISO/IEC 9899:201x we find that there is a sequence point

Between the evaluations of the first operand of the conditional ?: operator and whichever of the second and third operands is evaluated

For the behavior to be well defined, the same object must not be modified twice (via side effects) between two sequence points.

In your expression, the only conflicts that could arise are between the first and second ++i, or between the first and second ++j. In both cases the sequence point of the conditional operator separates the two modifications.

At every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine (this is what you would compute on paper, as on a Turing machine).

Quote from 5.1.2.3p3 Program execution

The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B.

Side effects in your code are ordered only by sequence points. Between two sequence points, the compiler may order the side effects as it wishes.

For example, take i = i++. None of the operators involved in this expression introduces a sequence point, so the two modifications of i are unsequenced, and the standard leaves the behavior undefined. In practice, the compiler may produce any of these sequences:

i = i; i = i+1; or i = i+1; i = i; or tmp = i; i = i+1; i = tmp; or tmp = i; i = tmp; i = i+1; or anything else it considers consistent with the abstract semantics of the computation. ISO 9899 defines the C language in terms of these abstract semantics.
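
To see why the choice of ordering matters, here is a minimal sketch that writes two of the orderings listed above as ordinary, well-defined C. The function names are invented for this illustration; the code does not contain i = i++ itself, only two of the step sequences a compiler might pick for it.

```c
#include <assert.h>

/* tmp = i; i = i + 1; i = tmp;   the assignment lands last */
int store_last(int i)
{
    int tmp = i;
    i = i + 1;
    i = tmp;      /* overwrites the incremented value */
    return i;
}

/* tmp = i; i = tmp; i = i + 1;   the increment lands last */
int increment_last(int i)
{
    int tmp = i;
    i = tmp;
    i = i + 1;    /* the increment survives */
    return i;
}

/* Small self-check: the two orderings leave different values in i. */
void check_orderings_differ(void)
{
    assert(store_last(5) == 5);
    assert(increment_last(5) == 6);
}
```

Starting from the same value, the two orderings end with different results, so a program containing i = i++ cannot rely on either one.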

alinsoar
  • I think the part where you enumerate the modifications to `i` and `j` as "possible conflicts" adds something new and useful to the analysis. I hadn't thought of that, thanks! – max1000001 Mar 14 '19 at 20:46
5

There may be no UB in your program, but in the question: Does the statement int val = (++i > ++j) ? ++i : ++j; invoke undefined behavior?

The answer is yes. Either or both of the increment operations may overflow, since i and j are signed, in which case all bets are off.

Of course this doesn't happen in your full example because you've specified the values as small integers.
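
If the operands could come from anywhere, one way to rule out the overflow case is to test against INT_MAX before each increment. The following is only a sketch under that assumption; checked_increment and guarded_conditional are hypothetical helpers, not part of the question or of any library.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: increments *x only if that cannot overflow. */
bool checked_increment(int *x)
{
    if (*x == INT_MAX) {
        return false;       /* ++*x would be signed overflow, which is UB */
    }
    ++*x;
    return true;
}

/*
 * Hypothetical guarded version of
 *     val = (++i > ++j) ? ++i : ++j;
 * Returns false as soon as an increment would overflow. On failure,
 * increments that already succeeded are NOT rolled back.
 */
bool guarded_conditional(int *i, int *j, int *val)
{
    if (!checked_increment(i) || !checked_increment(j)) {
        return false;
    }
    int *chosen = (*i > *j) ? i : j;    /* which operand gets the second increment */
    if (!checked_increment(chosen)) {
        return false;
    }
    *val = *chosen;
    return true;
}

/* Small self-check covering the question's values and one overflow case. */
void check_guarded(void)
{
    int i = 1, j = 2, val = 0;
    assert(guarded_conditional(&i, &j, &val));
    assert(i == 2 && j == 4 && val == 4);

    int big = INT_MAX, small = 0, out = 0;
    assert(!guarded_conditional(&big, &small, &out));
    assert(big == INT_MAX);             /* never incremented */
}
```

With i = 1 and j = 2 this yields the same 4 as the original statement; with an operand at INT_MAX it reports failure instead of overflowing.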

Doug Currie
  • 4
    I assure you that the question is not about signed integer overflow. It's about whether there is a sequence point between the first operand of the ternary operator and whichever wins from the second and the third operands. – machine_1 Mar 14 '19 at 21:48
  • 2
    The question was about „some undefined behavior“ and reminders about data types being implementation specific are totally appropriate for such an open question. And a signed integer overflow is UB. – eckes Mar 15 '19 at 00:15
  • @eckes but the question was “Does the …”, so an unconditional “yes” is a wrong answer. If the question was “Can the …” or “May the …”, the answer would be correct. – Holger Mar 15 '19 at 07:45
  • Was going to complain, but on a second thought +1 :) – Damon Mar 15 '19 at 12:54
0

I was going to comment on @Doug Currie's answer that signed integer overflow was a bit too far-fetched, although technically correct as an answer. On the contrary!

On second thought, I think Doug's answer is not only correct, but, assuming something less trivial than the three-liner in the example (say, a program with a loop), should be extended to a clear, definite "yes". Here's why:

The compiler sees int i = 1, j = 2;, so it knows that ++i will be equal to j and thus cannot possibly be larger than j or even ++j. Modern optimizers see such trivial things.

Unless, of course, one of them overflows. But the optimizer knows that this would be UB, so it assumes that overflow will never happen and optimizes accordingly.

So the ternary operator's condition is always-false (in this easy example certainly, but even if invoked repeatedly in a loop this would be the case!), and i will only ever be incremented once, whereas j will always be incremented twice. Thus not only is j always larger than i, it even gains at every iteration (until overflow happens, but this never happens per our assumption).

Thus, the optimizer is allowed to turn this into ++i; j += 2; unconditionally, which surely isn't what one would expect.
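
The transformation can be sketched as two functions that agree as long as i <= j on entry and nothing overflows. This is an illustration of the reasoning only, not actual compiler output; struct state, as_written, folded, and check_folding are names invented for the sketch.

```c
#include <assert.h>

/* Hypothetical bundle of the three variables from the question. */
struct state { int i; int j; int val; };

/* The statement exactly as written in the question. */
struct state as_written(struct state s)
{
    s.val = (++s.i > ++s.j) ? ++s.i : ++s.j;
    return s;
}

/*
 * What an optimizer may substitute when it knows i <= j on entry and
 * assumes no overflow: the condition is always false, so i is
 * incremented once and j twice.
 */
struct state folded(struct state s)
{
    s.i += 1;
    s.j += 2;
    s.val = s.j;
    return s;
}

/* Small self-check: starting from i = 1, j = 2, both forms stay in step. */
void check_folding(void)
{
    struct state a = { 1, 2, 0 };
    struct state b = a;
    for (int n = 0; n < 1000; ++n) {
        a = as_written(a);
        b = folded(b);
        assert(a.i == b.i && a.j == b.j && a.val == b.val);
    }
}
```

The two functions diverge only once an increment would overflow, which is exactly the case the optimizer is permitted to ignore.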

The same applies for e.g. a loop with unknown values of i and j, such as user-supplied input. The optimizer might very well recognize that the sequence of operations only depends on the initial values of i and j. Thus, the sequence of increments followed by a conditional move can be optimized by duplicating the loop, once for each case, and switching between the two with a single if(i>j). And then, while we're at it, it might fold the loop of repeated increment-by-twos into something like (j-i)<<1 which it just adds. Or something.
Under the assumption that overflow never happens -- which is the assumption that the optimizer is allowed to make, and does make -- such a modification, which may completely change the entire sense and mode of operation of the program, is perfectly fine.

Try and debug that.

Damon