
I have been fooling around with some code and saw something that I don't understand the "why" of.

int i = 6;
int j;

int *ptr = &i;
int *ptr1 = &j;

j = i++;

//now j == 6 and i == 7. Straightforward.

What if you put the operator on the left side of the equals sign?

++ptr = ptr1;

is equivalent to

(ptr = ptr + 1) = ptr1; 

whereas

ptr++ = ptr1;

is equivalent to

ptr = ptr + 1 = ptr1;

The postfix version produces a compilation error, and I get why: you've got a constant "ptr + 1" on the left side of an assignment operator. Fair enough.

The prefix one compiles and WORKS in C++. Yes, I understand it's messy and you're dealing with unallocated memory, but it works and compiles. In C this does not compile, returning the same error as the postfix "lvalue required as left operand of assignment". This happens no matter how it's written, expanded out with two "=" operators or with the "++ptr" syntax.

What is the difference between how C handles such an assignment and how C++ handles it?

AnFi
Moe45673
  • As far as I know `++i` doesn’t return an l-value in C. Regardless, this is UB as you modify the variable 2 times between two consecutive sequence points. In other words it’s unspecified whether the value is incremented first or it is assigned first. – bolov Sep 03 '14 at 22:04
  • @OllieFord I never understood UB to mean it might not compile. I mean, people say anything can happen, but I take it to mean anything can happen *when you run the code*. – juanchopanza Sep 03 '14 at 22:06
  • @juanchopanza the code runs, it is UB so the program goes back in time and stops the compiling process. So… yeah… – bolov Sep 03 '14 at 22:07
  • @juanchopanza Wouldn't an "ideal compiler" catch all UB, with `Error: UB`? – OJFord Sep 03 '14 at 22:07
  • @juanchopanza: Perhaps the program goes back in time and interrupts the compilation. Edit: I see bolov had the same idea – Benjamin Lindley Sep 03 '14 at 22:07
  • @OllieFord: I am using Eclipse in a Windows environment. Eclipse uses MinGW so, ostensibly, the respective compilers should behave somewhat similarly, no? – Moe45673 Sep 03 '14 at 22:11
  • @OllieFord An ideal compiler in a perfect world? Maybe. :) – user2719058 Sep 03 '14 at 22:13
  • @OllieFord A warning (or error) on invoking UB would be great, but UB is not always detectable. Left-shifting a negative integer is UB, but the compiler doesn't know what value a particular signed integer variable may hold. – cdhowie Sep 03 '14 at 22:13
  • @OllieFord _"Wouldn't an "ideal compiler" catch all UB, with Error: UB? "_ No, they can't. UB is in situations, where something is (or was forced e.g. by improper casting) to be syntactically correct, but has wrong conditions at runtime (e.g. dereferencing uninitialized pointers, etc.). [SCA tools](http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis) provide such analysis for some cases (depends on their quality). – πάντα ῥεῖ Sep 03 '14 at 22:14
  • @bolov I get that C doesn't return an lvalue for that operator and perhaps that's the big issue. But why does the compilation error still happen when written expanded out with brackets and two "=" operators? – Moe45673 Sep 03 '14 at 22:15
  • @πάνταῥεῖ "Wouldn't", not "couldn't" – OJFord Sep 03 '14 at 22:15
  • @OllieFord Well an ideal compiler still can't determine exactly what will happen at runtime, which is where UB lives. Some UB could be detected at compile-time (a warning then would be great!) but the large majority simply cannot be. – cdhowie Sep 03 '14 at 22:17
  • @Moe45673: If `++ptr` is not an l-value, why would you expect `(ptr = ptr + 1)` to be? – Benjamin Lindley Sep 03 '14 at 22:17
  • The result of assignment is an rvalue in C and an lvalue in C++ (and `++x` is nothing more than `x += 1`). – T.C. Sep 03 '14 at 22:17
  • @OllieFord The subjunctive doesn't help. It's not possible to build such an _ideal compiler_ that catches all these situations. Integration of automatic SCA might be a solution for specific toolchains. – πάντα ῥεῖ Sep 03 '14 at 22:18
  • @juanchopanza they tried... But when they tried, this question hadn't been created yet, so they failed – Slava Sep 03 '14 at 22:18
  • @IwillnotexistIdonotexist The question is why doesn't the prefix version compile in C while it does in C++ (being more experienced with C++, I expected that to compile). – juanchopanza Sep 03 '14 at 22:19
  • @bolov I think `++ptr = ptr1` is not UB in C++ (>= 11). There is a sequenced-before relationship between the side-effect of the prefix `++` and the side-effect of `=`. – dyp Sep 03 '14 at 22:20
  • @dyp I never fully got how `=` works in C++11 with respect to sequencing. – bolov Sep 03 '14 at 22:22
  • @juanchopanza Interesting, I tried myself and it does compile in C++. Myself I can only speak for C, and indeed it doesn't compile either way in C. – Iwillnotexist Idonotexist Sep 03 '14 at 22:25
  • @bolov Yeah, the sequencing rules are confusing sometimes. For `=`, it's still rather simple: the left and right hand side are computed, THEN the side-effect happens, THEN the value of the assignment-expression is computed. The (value) computation of the lhs and rhs are unsequenced. The value computation of the lhs in this case requires the side-effect of `++` to have happened. – dyp Sep 03 '14 at 22:25
  • @IwillnotexistIdonotexist Yes, it is an lvalue in C++. I'm not sure what the reason for it not to be one in C could be. – juanchopanza Sep 03 '14 at 22:27
  • C99 rationale, 6.5.2.4 "The C89 Committee did not endorse the practice in some implementations of considering post-increment and post-decrement operator expressions to be lvalues." The same applies (by reference) to prefix increment/decrement. – dyp Sep 03 '14 at 22:29
  • @dyp it is indeed not UB, the logic used by defect report 637 really helps in making an easy to read explanation of why. These proofs are always hard to make understandable and 637 is one of the best I have read. – Shafik Yaghmour Sep 04 '14 at 13:01
  • @bolov you may find the defect report 637 which I quote in my answer very helpful in your understanding, it was one of the most helpful proofs I have read. – Shafik Yaghmour Sep 04 '14 at 13:02
  • @OllieFord please see [A C++ implementation that detects undefined behavior?](http://stackoverflow.com/q/7237963/1708801), catching all UB would not be possible as some answers explain. John Regehr whom I link to in my answer has some of the best articles on this and related topics. – Shafik Yaghmour Sep 04 '14 at 13:14
  • @cdhowie presumably `clang` runs its [undefined behavior sanitizer](http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation) at runtime so it can catch more cases. – Shafik Yaghmour Sep 04 '14 at 18:48
  • @ShafikYaghmour It does have a mode that will emit extra code to catch some undefined behaviors and emit signals when this happens, and in fact it does have some compile-time UB checks, too. But I doubt it will catch *everything*. Of course, something is better than nothing. – cdhowie Sep 04 '14 at 18:50

2 Answers


In both C and C++, the result of x++ is an rvalue, so you can't assign to it.

In C, ++x is equivalent to x += 1 (C standard §6.5.3.1/p2; all C standard cites are to WG14 N1570). In C++, ++x is equivalent to x += 1 if x is not a bool (C++ standard §5.3.2 [expr.pre.incr]/p1; all C++ standard cites are to WG21 N3936).

In C, the result of an assignment expression is an rvalue (C standard §6.5.16/p3):

An assignment operator stores a value in the object designated by the left operand. An assignment expression has the value of the left operand after the assignment, but is not an lvalue.

Because it's not an lvalue, you can't assign to it: (C standard §6.5.16/p2 - note that this is a constraint)

An assignment operator shall have a modifiable lvalue as its left operand.

In C++, the result of an assignment expression is an lvalue (C++ standard §5.17 [expr.ass]/p1):

The assignment operator (=) and the compound assignment operators all group right-to-left. All require a modifiable lvalue as their left operand and return an lvalue referring to the left operand.

So ++ptr = ptr1; is a diagnosable constraint violation in C, but does not violate any diagnosable rule in C++.
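These value categories can also be checked by asking the compiler directly instead of waiting for an error. The following is only a minimal sketch, assuming a C++11 or later compiler; the function names `check_value_categories` and `assign_through_result` and the variables in them are made up for illustration. It relies on `decltype((expr))` yielding `T&` for an lvalue expression and plain `T` for a prvalue.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: checks the value categories at compile time.
// The operand of decltype is never evaluated, so x is not modified here.
inline void check_value_categories() {
    int x = 0;
    int y = 0;
    static_assert(std::is_same<decltype((x++)), int>::value,
                  "x++ is a prvalue");
    static_assert(std::is_same<decltype((++x)), int&>::value,
                  "++x is an lvalue in C++");
    static_assert(std::is_same<decltype((x = y)), int&>::value,
                  "x = y is an lvalue in C++");
    static_assert(std::is_same<decltype((x += 1)), int&>::value,
                  "x += 1 is an lvalue in C++");
    (void)x;  // silence unused-variable warnings
    (void)y;
}

// Hypothetical helper: assigns through the result of an assignment.
inline int assign_through_result(int a, int b, int c) {
    int* target = &(a = b);  // taking the address requires an lvalue
    assert(target == &a);    // the result refers to a itself
    (a = b) = c;             // accepted in C++, rejected by a C compiler
    return a;                // a now holds c
}
```

Calling `assign_through_result(1, 2, 3)` from any function should give back the last argument, since the outer assignment is the final write to `a`.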

However, pre-C++11, ++ptr = ptr1; has undefined behavior, as it modifies ptr twice between two adjacent sequence points.

In C++11, the behavior of ++ptr = ptr1 becomes well defined. It's clearer if we rewrite it as

(ptr += 1) = ptr1;

Since C++11, the C++ standard provides that (§5.17 [expr.ass]/p1)

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression. With respect to an indeterminately-sequenced function call, the operation of a compound assignment is a single evaluation.

So the assignment performed by the = is sequenced after the value computation of ptr += 1 and ptr1. The assignment performed by the += is sequenced before the value computation of ptr += 1, and all value computations required by the += are necessarily sequenced before that assignment. Thus, the sequencing here is well-defined and there is no undefined behavior.
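As a rough illustration of that ordering, here is a small sketch, assuming a C++11 or later compiler. The helper names `increment_then_assign` and `compound_then_assign` are hypothetical and exist only to wrap the statement from the question so the final value can be inspected.

```cpp
// Hypothetical helper: runs the statement from the question on copies of
// the two pointers. ptr must point at an object so that ++ptr is valid.
int* increment_then_assign(int* ptr, int* ptr1) {
    ++ptr = ptr1;  // the increment happens first, then ptr is overwritten
    return ptr;
}

// The same statement spelled with the compound assignment.
int* compound_then_assign(int* ptr, int* ptr1) {
    (ptr += 1) = ptr1;
    return ptr;
}
```

With `int i = 6, j = 0;`, both `increment_then_assign(&i, &j)` and `compound_then_assign(&i, &j)` should return `&j`: the outer assignment is the last side effect, so the increment is simply overwritten.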

T.C.
  • In your last quote is "the assignment" supposed to mean "the side-effect of the assignment" ? – M.M Sep 04 '14 at 03:53
  • Actually I don't understand why in C the value of a (non compound) assignment expression is said to be that of its left operand; it is actually the value of its **right** operand (and then obviously it is an rvalue). Sure, in cases like `i=i+1` it is not the value the right operand expression would get if it were evaluated again, but a similar statement is not true for the left operand expression either, in somewhat contrived cases like `int a[2]={0,3}; a[a[0]]=1`. – Marc van Leeuwen Sep 04 '14 at 13:21
  • There's one more point: every object in C++ is equivalent to an array of one object, itself, and the one-past-the-end value of any array is a valid pointer value. So there's no "messy … unallocated memory." – Potatoswatter Sep 04 '14 at 13:55
  • @Potatoswatter good point, added a note to my answer covering that. – Shafik Yaghmour Sep 04 '14 at 16:57
  • `++ptr = ptr1;` is grammatically correct in both languages (the left hand expression can be an *unary-expression* in C); Or what did you mean by syntactical correctness? – Columbo Feb 20 '15 at 14:09
  • @T.C. "Diagnosable" implies that the violation does not require a diagnostic, but it requires at least one diagnostic as per 5.1.1.3/1, doesn't it? I'm solely asking because I saw the word "Diagnosable" a couple of times in such contexts, and it seems that it's not the right word to use. – Columbo Feb 20 '15 at 14:34
  • @Columbo Well, C doesn't use this word normatively; C++ does, and it actually means "rules whose violation must be diagnosed" (see [intro.compliance]/p1). – T.C. Feb 20 '15 at 14:40
  • "The post-C++11 C++ standard" Hmmm, I think you mean "The C++11 C++ standard". (Interesting issue about English words pre/post in question about prefix/postfix.) – chux - Reinstate Monica Feb 20 '15 at 15:37
  • @chux Better? (I wanted to also include C++14.) – T.C. Feb 20 '15 at 15:39
  • @MarcvanLeeuwen: Here's an example where there's a big difference, and clearly **the value of the assignment expression is not the value of its right operand**: http://rextester.com/CIWA70704 – Ben Voigt Feb 20 '15 at 16:16

In C, the results of both pre- and post-increment are rvalues, and we cannot assign to an rvalue; we need an lvalue (also see: Understanding lvalues and rvalues in C and C++). We can see this by going to the draft C11 standard section 6.5.2.4 Postfix increment and decrement operators, which says (emphasis mine going forward):

The result of the postfix ++ operator is the value of the operand. [...] See the discussions of additive operators and compound assignment for information on constraints, types, and conversions and the effects of operations on pointers. [...]

So the result of post-increment is a value, which is synonymous with rvalue, and we can confirm this by going to section 6.5.16 Assignment operators, which the paragraph above points us to for further understanding of constraints and results. It says:

[...] An assignment expression has the value of the left operand after the assignment, but is not an lvalue.[...]

which further confirms the result of post-increment is not an lvalue.

For pre-increment, section 6.5.3.1 Prefix increment and decrement operators says:

[...]See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.

This also points back to 6.5.16, as post-increment does, and therefore the result of pre-increment in C is also not an lvalue.

In C++, post-increment is also an rvalue, more specifically a prvalue. We can confirm this by going to section 5.2.6 Increment and decrement, which says:

[...]The result is a prvalue. The type of the result is the cv-unqualified version of the type of the operand[...]

With respect to pre-increment, C and C++ differ. In C the result is an rvalue, while in C++ the result is an lvalue, which explains why ++ptr = ptr1; works in C++ but not in C.

For C++ this is covered in section 5.3.2 Increment and decrement which says:

[...]The result is the updated operand; it is an lvalue, and it is a bit-field if the operand is a bit-field.[...]
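A practical consequence of "the result is the updated operand" is that a non-const reference or a pointer can be bound to the result. This is a small sketch only; the helper names `write_through_preincrement` and `preincrement_names_operand` are invented for illustration.

```cpp
// Hypothetical helper: binds a reference to the result of ++x and
// writes through it.
int write_through_preincrement(int x) {
    int& ref = ++x;  // compiles only because ++x is an lvalue
    // int& bad = x++;  // would not compile: x++ is a prvalue
    ref = 100;       // writes to x itself
    return x;
}

// Hypothetical helper: checks that ++x designates x, not a copy.
bool preincrement_names_operand(int x) {
    int* p = &++x;   // taking the address requires an lvalue
    return p == &x;
}
```

`write_through_preincrement(5)` should return 100, since `ref` refers to `x` itself, and `preincrement_names_operand` should return true for any argument that can be incremented without overflow.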

To understand whether:

++ptr = ptr1;

is well defined or not in C++, we need two different approaches: one for pre-C++11 and one for C++11.

Pre-C++11, this expression invokes undefined behavior, since it modifies the object more than once between two consecutive sequence points. We can see this by going to a pre-C++11 draft standard, section 5 Expressions, which says:

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined. [ Example:

 i = v[i++];        // the behavior is undefined
 i = 7, i++, i++;   // i becomes 9
 i = ++i + 1;       // the behavior is undefined
 i = i + 1;         // the value of i is incremented

—end example ]

We are incrementing ptr and then subsequently assigning to it, which is two modifications and in this case the sequence point occurs at the end of the expression after the ;.

For C++11, we should go to defect report 637: Sequencing rules and example disagree, which was the defect report that resulted in:

i = ++i + 1;

becoming well-defined behavior in C++11, whereas prior to C++11 this was undefined behavior. The explanation in this report is one of the best I have ever seen; reading it many times was enlightening and helped me understand many concepts in a new light.

The logic that led to this expression becoming well-defined behavior goes as follows:

  1. The assignment side-effect is required to be sequenced after the value computations of both its LHS and RHS (5.17 [expr.ass] paragraph 1).

  2. The LHS (i) is an lvalue, so its value computation involves computing the address of i.

  3. In order to value-compute the RHS (++i + 1), it is necessary to first value-compute the lvalue expression ++i and then do an lvalue-to-rvalue conversion on the result. This guarantees that the incrementation side-effect is sequenced before the computation of the addition operation, which in turn is sequenced before the assignment side effect. In other words, it yields a well-defined order and final value for this expression.
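The ordering described in these three steps can be observed with a tiny sketch, assuming the code is compiled as C++11 or later (under the older rules the same statement is undefined). The wrapper name `increment_in_rhs` is hypothetical.

```cpp
// Hypothetical helper wrapping the expression from defect report 637.
int increment_in_rhs(int i) {
    // Step 3: ++i is value-computed first (i becomes i + 1), then 1 is
    // added, and only then does the assignment side effect store into i.
    i = ++i + 1;
    return i;
}
```

Starting from 5, the increment gives 6, the addition gives 7, and the assignment stores 7, so `increment_in_rhs(5)` should return 7.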

The logic is somewhat similar for:

++ptr = ptr1;

  1. The value computations of the LHS and RHS are sequenced before the assignment side-effect.

  2. The RHS (ptr1) is an lvalue, so its value computation involves computing the address of ptr1 and then doing an lvalue-to-rvalue conversion to read its value.

  3. The LHS (++ptr) is an lvalue expression equivalent to ptr += 1, and the side effect of an assignment is sequenced before the value computation of the assignment expression. This guarantees that the incrementation side-effect is sequenced before the value computation of ++ptr, which in turn is sequenced before the outer assignment side effect. In other words, it yields a well-defined order and final value for this expression.

Note

The OP said:

Yes, I understand it's messy and you're dealing with unallocated memory, but it works and compiles.

Pointers to non-array objects are treated as pointers into arrays of size one for additive operators. I am going to quote the draft C++ standard, but C11 has almost exactly the same text. From section 5.7 Additive operators:

For the purposes of these operators, a pointer to a nonarray object behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

The same section further tells us that pointing one past the end of an array is valid as long as you don't dereference the pointer:

[...]If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

so:

++ptr;

is still a valid pointer.
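To make the "array of length one" rule concrete, here is a minimal sketch; the function name `step_one_past` is hypothetical. It forms the one-past-the-end pointer and compares it with the original, but never dereferences it.

```cpp
#include <cstddef>

// Hypothetical helper: steps a pointer to a single object one past it.
std::ptrdiff_t step_one_past(int& object) {
    int* start = &object;  // pointer to an "array" of length one
    int* ptr = start;
    ++ptr;                 // one past the end: valid to form and compare
    // *ptr would be undefined behavior, so it is never dereferenced.
    return ptr - start;    // both pointers belong to the same "array"
}
```

For any `int` object, `step_one_past` should return 1. Incrementing the pointer a second time would leave the range the quoted text allows.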

Shafik Yaghmour