The undefined behavior arises because the variable `i` is modified more than once between two sequence points. A sequence point is a point at which all side effects of previous evaluations are complete and no side effects of subsequent evaluations have yet taken place. The standard states:
> Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored.
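To make both clauses of that rule concrete, here is a small sketch (the array `a` is purely illustrative and not part of the original question):

```c
int main(void) {
    int i = 0;
    int a[2] = {0, 0};

    i = i + 1;   /* OK: i is modified once, and its prior value is
                    read only to determine the value being stored */

    i = ++i;     /* UB: i is modified twice (by ++ and by =)
                    between the same pair of sequence points */

    a[i] = i++;  /* UB: the prior value of i is read to index a,
                    i.e. for a purpose other than determining the
                    value stored into i */

    return 0;
}
```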
So, what are the side effects that we are concerned about?
- `++i`, which assigns to `i` the value `i + 1`
- `i = ++i`, which assigns to `i` the value of the expression `++i`, which is `i + 1`
So we are going to get two (admittedly equivalent) side effects: assigning `i + 1` to the variable `i`. The question is: between which two sequence points do these side effects occur?
What operations constitute sequence points? There are multiple, but there is only one that is actually relevant here:
- at the end of a full expression (in this case, `i = ++i` is a full expression)
Notably, the pre-increment `++i` is not itself a sequence point. This means that both side effects (the increment and the assignment) occur between the same two sequence points, modifying the same variable `i`. That is undefined behavior; the fact that both modifications happen to store the same value is inconsequential.
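For completeness, a minimal sketch of well-defined ways to write the (presumed) intent of simply incrementing `i`. Note also that compilers such as GCC can typically diagnose the undefined form (e.g. via `-Wsequence-point`, which `-Wall` enables), though that is an observation about common implementations, not a requirement of the standard:

```c
int main(void) {
    int i = 0;

    i = ++i;    /* undefined: two unsequenced modifications of i */

    /* Each of the following is a full expression that modifies i
       exactly once, so each is well defined: */
    ++i;
    i = i + 1;
    i += 1;

    return 0;
}
```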
But why is it bad to modify a variable multiple times between sequence points? To prevent things like `i = ++i + 1;`. Here, `i` is incremented, but it is also assigned the value `(i + 1) + 1`, due to the semantics of the pre-increment. Since these two side effects have no defined ordering relative to each other, the behavior is undefined.
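To see why the ordering matters, here is a sketch of two outcomes an implementation could plausibly produce for `i = ++i + 1;` (with undefined behavior it is of course free to do anything at all; the concrete numbers below just assume `i` starts at 1):

```c
int main(void) {
    int i = 1;

    /* i = ++i + 1;   -- undefined; two unsequenced stores to i:
     *
     *   store A: the pre-increment writes i + 1       (here: 2)
     *   store B: the assignment writes (i + 1) + 1    (here: 3)
     *
     * If store A happens to take effect last, i ends up as 2;
     * if store B takes effect last, i ends up as 3. */

    /* A well-defined way to express the presumed intent: */
    ++i;
    i = i + 1;   /* each statement ends at a sequence point */

    return 0;
}
```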
Now, the standard could hypothetically make a special case saying that multiple modifications between two sequence points are fine as long as they store the same value, but this would likely just complicate compiler implementations without much benefit.