
From Prasoon's answer to the question regarding "Undefined Behavior and Sequence Points", I do not understand what the following means:

.. the prior value shall be accessed only to determine the value to be stored.

As examples, the following are cited to possess Undefined Behaviour in C++:

  1. a[i] = i++;
  2. int x = i + i++;

Despite the explanations given there, I do not understand this part (I think I correctly understand the rest of the answer).


I do not understand what is wrong with the above code samples. I think they break down into well-defined steps for the compiler, as shown below.

a[i] = i++;

  • a[i] = i;
  • i = i + 1;

int x = i + i++ ;

  • x = i + i;
  • i = i + 1;

What am I missing? What does 'prior value shall be accessed only to determine the value to be stored' mean?

Lazer

4 Answers

5

See also this question and my answer to it. I'm not going to vote to close this as a duplicate because you're asking about C++ rather than C, but I believe the issue is the same in both languages.

the prior value shall be accessed only to determine the value to be stored.

This does seem like an odd requirement; why should the standard care why a value is accessed? It makes sense when you realize that if the prior value is read to determine the value to be stored in the same object, that implicitly imposes an ordering on the two operations, so the read has to happen before the write. Because of that ordering, the two accesses to the same object (one read and one write) are safe. The compiler cannot rearrange (optimize) the code in a way that causes them to interfere with each other.

On the other hand, in an expression like

a[i] = i++

there are three accesses to i: a read on the left hand side to determine which element of a is to be modified, a read on the right hand side to determine the value to be incremented, and a write that stores the incremented value back in i. The read and write on the RHS are ok (i++ by itself is safe), but there's no defined ordering between the read on the LHS and the write on the RHS. So the compiler is free to rearrange the code in ways that change the relationship between those read and write operations, and the standard figuratively throws up its hands and leaves the behavior undefined, saying nothing about the possible consequences.

Both C11 and C++11 change the wording in this area, making some ordering requirements explicit. The "prior value" wording is no longer there. Quoting from a draft of the C++11 standard, 1.9p15:

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...] The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.
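
Under either wording, the usual way out is to split the expression into separate statements, so that every read and write of i is ordered by a statement boundary. Here is a minimal sketch for a[i] = i++, assuming a is a plain int array; the helper names are hypothetical and only spell out the two things the author might have meant:

```cpp
// Hypothetical helper: store the old value of i at the old index,
// then increment. Each statement is its own full expression, so the
// reads of i are sequenced before the write.
void store_then_increment(int* a, int& i) {
    a[i] = i;  // only reads i
    ++i;       // the write happens in a separate statement
}

// Hypothetical helper: increment first, then store the old value of i
// at the new index.
void increment_then_store(int* a, int& i) {
    const int old_value = i;  // capture the prior value explicitly
    ++i;
    a[i] = old_value;
}
```

Both functions are well defined, and the choice between them is made by the programmer rather than left to the compiler.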

Keith Thompson
3
a[i] = i++;

i is modified. i is also read to determine which index of a to use, which does not affect the store to i. That's not allowed.

int x = i + i++;

i is modified. i is also used to calculate the value to store into x, which does not affect the store to i. That's not allowed.

1

Since the standard says that "the prior value shall be accessed only to determine the value to be stored", compilers are not required to follow the "well defined" steps you outlined.

And they often don't.

What the wording of the standard means for your particular examples is that the compiler is permitted to order the steps like so:

a[i] = i++;
  • i = i + 1;
  • a[i] = i;

int x = i + i++ ;

  • i = i + 1;
  • x = i + i;

These orderings give entirely different outcomes from your imagined well-defined order. The compiler is also permitted to do whatever else it might like, even if it makes less sense to you than what I just typed above. That's what undefined behavior means.
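
To make the difference concrete for the second example, the two orderings can be written out as separate, well-defined functions. This is only a sketch: the function names are hypothetical, and a real compiler is not limited to these two outcomes.

```cpp
// Ordering assumed in the question: both reads of i happen before
// the increment.
int sum_then_increment(int& i) {
    int x = i + i;
    i = i + 1;
    return x;
}

// Ordering from the list above: the increment happens before both
// reads of i.
int increment_then_sum(int& i) {
    i = i + 1;
    int x = i + i;
    return x;
}
```

Starting from i == 3, the first function returns 6 and the second returns 8; both leave i == 4.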

Michael Burr
0

While a statement like x=y+z; is semantically equivalent to temp=y; temp+=z; x=temp; there's generally no requirement (unless x is volatile) for a compiler to implement it that way. It may on some platforms be much more efficiently performed as x=y; x+=z;. Unless a variable is volatile, the code a compiler generates for an assignment may write any sequence of values to it provided that:

  1. Any code which is entitled to read the "old" value of the variable acts upon the value it had before the assignment.

  2. Any code which is entitled to read the "new" value of the variable acts upon the final value it was given.

Given i=511; foo[i] = i++; a compiler would be entitled to write the value 511 to foo[511] or to foo[512], but would be no less entitled to store it to foo[256] or foo[767], or foo[24601], or anything else. Since the compiler would be entitled to store the value at any possible displacement from foo, and since the compiler would be entitled to do anything it likes with code that adds an overly large displacement to a pointer, those permissions together effectively mean that the compiler could do anything it likes with foo[i]=i++;.
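
One plausible reading of the 256 and 767 figures (an interpretation, not something stated above) is a compiler that updates a 16-bit i one byte at a time, so that i briefly holds a mix of old and new bytes while the index is being read. The sketch below models that with hypothetical helper functions:

```cpp
#include <cstdint>

// Hypothetical model: value of i after only the low byte has been
// updated from old_value to new_value.
std::uint16_t after_low_byte_written(std::uint16_t old_value,
                                     std::uint16_t new_value) {
    return static_cast<std::uint16_t>((old_value & 0xFF00u) |
                                      (new_value & 0x00FFu));
}

// Hypothetical model: value of i after only the high byte has been
// updated from old_value to new_value.
std::uint16_t after_high_byte_written(std::uint16_t old_value,
                                      std::uint16_t new_value) {
    return static_cast<std::uint16_t>((new_value & 0xFF00u) |
                                      (old_value & 0x00FFu));
}
```

With old_value 511 (0x01FF) and new_value 512 (0x0200), writing the low byte first yields 256 (0x0100) and writing the high byte first yields 767 (0x02FF).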

Note that in theory, if i were a 16-bit unsigned int but foo were a 65536-element-or-larger array (entirely possible on the classic Macintosh), the above entitlements would allow a compiler given foo[i]=i++; to write to an arbitrary element of foo, but not do anything else. In practice, the Standard refrains from such fine distinctions. It's much easier to say that the Standard imposes no requirements on what compilers do when given expressions like foo[i]=i++; than to say that the compiler's behavior is constrained in some narrow corner cases but not in others.

supercat