19

Sorry for opening this topic again, but thinking about this topic itself has started giving me undefined behavior. I want to move into the zone of well-defined behavior.

Given

int i = 0;
int v[10];
i = ++i;     //Expr1
i = i++;     //Expr2
++ ++i;      //Expr3
i = v[i++];  //Expr4

I think of the above expressions (in that order) as

operator=(i, operator++(i))    ; //Expr1 equivalent
operator=(i, operator++(i, 0)) ; //Expr2 equivalent
operator++(operator++(i))      ; //Expr3 equivalent
operator=(i, operator[](v, operator++(i, 0))); //Expr4 equivalent

Now, coming to the behaviors, here are the relevant quotes from C++0x.

$1.9/12- "Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for lvalue evaluation and fetching a value previously assigned to an object for rvalue evaluation) and initiation of side effects."

$1.9/15- "If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined."

[ Note: Value computations and side effects associated with different argument expressions are unsequenced. —end note ]

$3.9/9- "Arithmetic types (3.9.1), enumeration types, pointer types, pointer to member types (3.9.2), std::nullptr_t, and cv-qualified versions of these types (3.9.3) are collectively called scalar types."

  • In Expr1, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i) (which has a side effect).

    Hence Expr1 has undefined behavior.

  • In Expr2, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of the expression operator++(i, 0) (which has a side effect).

    Hence Expr2 has undefined behavior.

  • In Expr3, the evaluation of the lone argument operator++(i) is required to be complete before the outer operator++ is called.

    Hence Expr3 has well defined behavior.

  • In Expr4, the evaluation of the expression i (first argument) is unsequenced with respect to the evaluation of operator[](v, operator++(i, 0)) (which has a side effect).

    Hence Expr4 has undefined behavior.

Is this understanding correct?


P.S. The method of analyzing the expressions used in this question is not correct. This is because, as @Potatoswatter notes, "clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators."

Potatoswatter
Chubsdad
  • +1: Good question. I would keep an eye out for the answers. – Arun Oct 04 '10 at 04:11
  • @Chubsdad : I agree with what @James McNellis said in his answer (which he deleted afterwards). All the 4 expressions invoke UB in C++0x [IMHO]. I think you should ask this question at csc++ (comp.std.c++). `:)` – Prasoon Saurav Oct 04 '10 at 08:15
  • @Prasoon Saurav: Why is Expr3 having undefined behavior? I thought this should be fine. gcc/comeau/llvm(demo) also all compile without any warning. – Chubsdad Oct 04 '10 at 08:29
  • That's because the side effects associated with `++` [inner] and `++` [outer] are not sequenced relative to each other (although the value computations are sequenced). `:)` – Prasoon Saurav Oct 04 '10 at 08:36
  • Check out [this](http://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wsequence_002dpoint-288). It has been mentioned that `Some more complicated cases are not diagnosed by -Wsequence-point option, and it may give an occasional false positive result,.....`. – Prasoon Saurav Oct 04 '10 at 08:38
  • @Prasoon if you say that this is undefined behavior, you will have to come up with wording of C++0x (currently n3126) that supports the point. Just quoting James Kanze won't prove the point, mate. – Johannes Schaub - litb Oct 05 '10 at 04:54
  • Both me and "Kai-Uwe Bux" have shown how the definedness follows from various C++0x rules. The intent may or may not be what the wording reflects, but that's an entire different story. – Johannes Schaub - litb Oct 05 '10 at 04:59
  • Johannes Schaub - litb: While you are at it, please tell us if this is the right way to visualize about these expressions or do I miss any case with such thinking (in terms of operator function call for native types even if these do not exist in practicality) except as in $13.6/18 – Chubsdad Oct 05 '10 at 05:21
  • For example this thinking also explains ++i = 0 (operator=(operator++(i), 0) as well-defined behavior. – Chubsdad Oct 05 '10 at 05:31

2 Answers

16

Native operator expressions are not equivalent to overloaded operator expressions. There is a sequence point at the binding of values to function arguments, which makes the operator++() versions well-defined. But that doesn't exist for the native-type case.
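To make the contrast concrete, here is a minimal sketch of the overloaded case. `Counter` is a hypothetical wrapper type invented for this illustration (it does not appear in the question); its operators are real functions, so each argument is fully evaluated before the function body runs.

```cpp
#include <cassert>

// Hypothetical wrapper type, used only to illustrate function-call semantics.
struct Counter {
    int value = 0;

    Counter() = default;
    Counter(const Counter&) = default;

    // Pre-increment: modifies the object and returns it by reference.
    Counter& operator++() { ++value; return *this; }

    // Post-increment: returns a copy of the old state.
    Counter operator++(int) { Counter old = *this; ++value; return old; }

    // Assignment written out so that the function call is visible.
    Counter& operator=(const Counter& rhs) { value = rhs.value; return *this; }
};

// Each expression below is a chain of function calls on a class type.
int chained_pre_increment() { Counter c; ++ ++c; return c.value; }
int assign_pre_increment()  { Counter c; c = ++c; return c.value; }
int assign_post_increment() { Counter c; c = c++; return c.value; }
```

With these definitions the three helpers return 2, 1 and 0 respectively. None of this carries over to `int`, which is the point of the paragraph above.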

In all four cases, i changes twice within the full-expression. Since no `,`, `||`, or `&&` appears in the expressions, that's instant UB.

§5/4:

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.
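One way to stay clear of this rule is to modify `i` at most once per statement. The sketch below shows possible rewrites of the four expressions; the function names are made up for illustration, and the intended meaning of Expr2 and Expr4 is an assumption (the old value of `i` is used, then `i` is overwritten).

```cpp
#include <cassert>

// Expr1, i = ++i: assumed to mean a plain increment.
int expr1_rewritten(int i) {
    ++i;
    return i;
}

// Expr2, i = i++: assumed to mean "increment, then store the old value back".
int expr2_rewritten(int i) {
    const int old = i;
    ++i;
    i = old;
    return i;
}

// Expr3, ++ ++i: two increments.
int expr3_rewritten(int i) {
    i += 2;
    return i;
}

// Expr4, i = v[i++]: assumed to index with the old value of i.
int expr4_rewritten(int i, const int (&v)[10]) {
    const int index = i;
    ++i;
    i = v[index];
    return i;
}
```

Each statement here is its own full-expression, so every modification of `i` is separated from the next.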

Edit for C++0x (updated)

§1.9/15:

The value computations of the operands of an operator are sequenced before the value computation of the result of the operator. If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

Note however that a value computation and a side effect are two distinct things. If ++i is equivalent to i = i+1, then + is the value computation and = is the side effect. From 1.9/12:

Evaluation of an expression (or a sub-expression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects.

So although the value computations are more strongly sequenced in C++0x than C++03, the side effects are not. Two side effects in the same expression, unless otherwise sequenced, produce UB.

Value computations are ordered by their data dependencies anyway and, side effects absent, their order of evaluation is unobservable, so I'm not sure why C++0x goes to the trouble of saying anything, but that just means I need to read more of the papers that Boehm and friends wrote.

Edit #3:

Thanks Johannes for coping with my laziness to type "sequenced" into my PDF reader search bar. I was going to bed and getting up on the last two edits anyway… right ;v) .

§5.17/1 defining the assignment operators says

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.

Also §5.3.2/1 on the preincrement operator says

If x is not of type bool, the expression ++x is equivalent to x+=1 [Note: see … addition (5.7) and assignment operators (5.17) …].

By this identity, ++ ++ x is shorthand for (x +=1) +=1. So, let's interpret that.

  • Evaluate the 1 on the far RHS and descend into the parens.
  • Evaluate the inner 1 and the value (prvalue) and address (glvalue) of x.
  • Now we need the value of the += subexpression.
    • We're done with the value computations for that subexpression.
    • The assignment side effect must be sequenced before the value of assignment is available!
  • Assign the new value to x, which is identical to the glvalue and prvalue result of the subexpression.
  • We're out of the woods now. The whole expression has now been reduced to x +=1.

So, then 1 and 3 are well-defined and 2 and 4 are undefined behavior, which you would expect.
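The walk-through of `(x += 1) += 1` can be sketched as two separate statements. This is only an illustration: the function name is made up, and splitting the expression across statements makes the ordering explicit rather than relying on the C++0x wording.

```cpp
#include <cassert>

// Performs (x += 1) += 1 in two steps and returns the final value of x.
int increment_twice_stepwise(int x) {
    // Built-in += yields an lvalue referring to its left operand,
    // so a reference can bind to the result of the inner assignment.
    int& after_first = (x += 1);
    assert(&after_first == &x);  // the result designates x itself

    // The outer += now operates on the already-updated object.
    after_first += 1;
    return x;
}
```

Calling `increment_twice_stepwise(0)` yields 2, the same result the reduction above arrives at.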

The only other surprise I found by searching for "sequenced" in N3126 was 5.3.4/16, where the implementation is allowed to call operator new before evaluating constructor arguments. That's cool.

Edit #4: (Oh, what a tangled web we weave)

Johannes notes again that in i == ++i; the glvalue (a.k.a. the address) of i is ambiguously dependent on ++i. The glvalue is certainly a value of i, but I don't think 1.9/15 is intended to include it for the simple reason that the glvalue of a named object is constant, and cannot actually have dependencies.

For an informative strawman, consider

( i % 2? i : j ) = ++ i; // certainly undefined

Here, the glvalue of the LHS of = is dependent on a side-effect on the prvalue of i. The address of i is not in question; the outcome of the ?: is.

Perhaps a good counterexample is

int i = 3, &j = i;
j = ++ i;

Here j has a glvalue distinct from (but identical to) i. Is this well-defined, yet i = ++i is not? This represents a trivial transformation that a compiler could apply to any case.
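A small sketch of the aliasing situation, with hypothetical helper names: whether two references designate the same object is in general only known at run time, and putting the increment in its own statement avoids the question of how `j = ++ i` is sequenced.

```cpp
#include <cassert>

// True when both references designate the same object.
bool same_object(const int& a, const int& b) {
    return &a == &b;
}

// Sequenced version of "target = ++source": one side effect per statement.
// Works whether or not target and source alias each other.
int assign_incremented(int& target, int& source) {
    ++source;
    target = source;
    return target;
}
```

With `int i = 3, &j = i;`, `same_object(i, j)` is true and `assign_incremented(j, i)` leaves `i` at 4.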

1.9/15 should say

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the prvalue of the same scalar object, the behavior is undefined.

Potatoswatter
  • Sorry, I mentioned C++0X a little bit late in my post – Chubsdad Oct 04 '10 at 04:14
  • @Potato AFAIK it's just the term "sequence point" that's been deprecated in favor of more clear wording, but it's still there. – NullUserException Oct 04 '10 at 04:23
  • @NullUser: There's a concept of sequencing, but the C way of saying the machine is either in a fully-determined state or indeterminate is gone. – Potatoswatter Oct 04 '10 at 04:25
  • I also thought so. But http://stackoverflow.com/questions/3850040/complicated-expression-involving-logical-and/3850092#3850092 brought in some changes to the thought process. Confusion is now on 'i = ++i;' and I have not been able to get my thinking straight on this one. With this post I am trying to verify if my thought process is right (thinking in terms of operator equivalent). – Chubsdad Oct 04 '10 at 07:48
  • @Potatoswatter : All four expressions [invoke UB in C++0x](http://groups.google.com/group/comp.lang.c++/browse_thread/thread/97e9767139bebcc6) – Prasoon Saurav Oct 04 '10 at 07:53
  • @Chubsdad: The value of `++i` is an operand to the `=` operator, so at least according to this paragraph, the effect is sequenced. As for whether something else makes it UB, I will check Prasoon's link next. – Potatoswatter Oct 04 '10 at 08:38
  • @Potatoswatter: "Native operator expressions are not equivalent to overloaded operator expressions" Is this Standardese? Are you talking here about a difference in sequence point location at the binding of values to function arguments between a built-in/native/default operator= and an overloaded operator= defined as above in user code? Is it possible to give an example of the "native-type case"? So if I have defined an overloaded++ and write i++ it's well defined behavior - otherwise undefined? And this applies to C++98/03 and C++0x? – Peter McG Oct 04 '10 at 08:40
  • @Prasoon: No, those unsequenced examples use postincrement. Johannes gives a preincrement example `i = v[++i]` and argues that the side effect of storing `(i=i+1)` is unsequenced relative to the next, explicit, assignment… that is another argument, and perhaps a good one. But I'm too sleepy to independently evaluate it for now. – Potatoswatter Oct 04 '10 at 08:43
  • @Peter: The beginning of the clause on overloaded operators and the beginning of the clause on expressions both discuss the difference in sequence points. For example, `&&`, `||`, and `,` are sequenced quite differently. Overloaded operators are function calls, Clause 5 operators are not. (Edit: except the function call operator. I go to bed now.) – Potatoswatter Oct 04 '10 at 08:45
  • @Potatoswatter : I am not very sure but still think `++ ++i` and `i = ++i` are both UB in C++0x. Read James Kanze's posts [at the end of that discussion]. `;)` – Prasoon Saurav Oct 04 '10 at 08:50
  • @Prasoon: Yep, reading Usenet rants is a chore but §1.9/12 and 15 really are pretty clear about it. Updated answer. – Potatoswatter Oct 04 '10 at 15:16
  • @Prasoon, the discussion you link to figures that expression 3 does *not* invoke undefined behavior in C++0x according to the draft. Even "James Kanze" agrees to that obvious thing after being told so one-hundred-and-one times (he doubts that "assign" is a side-effect... but if it isn't a modification, *what* is it?). James Kanze obviously does not listen to what one writes. Notice how I write in the middle of that discussion how "sequenced-before" is a transitive relation. He just ignores that, and finally when someone else mentions that, he says "Ohhh, you may have a point there, captain!". – Johannes Schaub - litb Oct 05 '10 at 04:48
  • The proof for the well-definedness of Expr3 and for undefinedness of Expr3 can be found here: http://stackoverflow.com/questions/3690141/multiple-preincrement-operations-on-a-variable-in-cc/3691469#3691469 – Johannes Schaub - litb Oct 05 '10 at 05:02
  • @Johannes: I think you mean undefinedness of Expr1… anyway I don't see how you figure the side effect of the first `++` in Expr3 is sequenced differently from the `=` in Expr1. The two `++` operations generate two distinct assignments to two different values and 1.9/15 does not sequence them. – Potatoswatter Oct 05 '10 at 05:18
  • @Potatoswatter yes, i meant undefinedness of Expr1, sorry :) Have to head to work. In the meantime, look into the paragraph for assignment in 5.x, where it's all sequenced. Seeya :) – Johannes Schaub - litb Oct 05 '10 at 05:25
  • we are back full circle. Why is 1 having well-defined behavior? That's what was proved to be wrong in the usenet discussion isn't it? – Chubsdad Oct 05 '10 at 09:02
  • @Chubdad: because the `++i` is a subexpression equivalent to an assignment expression (5.3.2/1) whose side-effect is ordered "before the value computation of the assignment expression." (5.17/1) – Potatoswatter Oct 05 '10 at 11:52
  • That's what I also felt initially. But the usenet discussion forum seems to conclude that behavior of expr1 is ill-formed. @litb also argues for that – Chubsdad Oct 05 '10 at 14:31
  • @Potatoswatter like @Chubsdad says, 1 is definitely undefined behavior. Yes, the side effect of `++i` is ordered before value computation of `++i`. This sequences the increment side effect before the "real" assignment side effect to `i`. But we also have a value computation of `i` on the left side in `i = ++i`, which is not sequenced relative to value computation of the right side. This is what makes it undefined. Note that "value computation" can not only mean "read a value" but means "compute what object an lvalue refers to" for glvalue evaluation. See the stackoverflow link above. – Johannes Schaub - litb Oct 05 '10 at 18:36
  • Bah, sorry for the churn, hold on… – Potatoswatter Oct 05 '10 at 18:55
  • @Potatoswatter I see what you mean now. – Johannes Schaub - litb Oct 05 '10 at 18:58
  • @Johannes: sorry, my connection went down and I went away forgetting you were waiting… posted the update. – Potatoswatter Oct 05 '10 at 19:36
  • @Potatoswatter I removed my answer because you elaborated on yours to give a reasonable rationale. – Johannes Schaub - litb Oct 05 '10 at 19:48
  • @Johannes: I guess it's your call, I thought it was a good answer. Do you really think the working paper is still in that much flux? Do you think a DR has been submitted, and would you want to do so? Alisdair hasn't replied to my last few mails, so I think I might have annoyed him ;v) . – Potatoswatter Oct 05 '10 at 19:54
  • @Potatoswatter: j = ++ i; should also be undefined behavior. j is just an alias for 'i', isn't it? I (capital I) fail to understand this and continue to exhibit undefined behavior...:) – Chubsdad Oct 06 '10 at 02:48
  • @Chubsdad: Even though it's an alias, its glvalue evaluation does not require a glvalue evaluation of `i`. Generally speaking, evaluating a reference does not require the original object to be on hand. There's no reason it should be UB, so it makes sense there should be an easy loophole or transformation to code which is not UB. – Potatoswatter Oct 06 '10 at 04:26
  • its glvalue evaluation does not require a glvalue evaluation of i. Generally speaking, evaluating a reference does not require the original object to be on hand : I really doubt this. Do you have any reference for this 'reference' :) – Chubsdad Oct 06 '10 at 04:49
  • @Chubsdad: Consider the function `f( int &i, int &j ) { j = ++ i; } … f( i, i );` Still think so? A reference is an alias because its glvalue evaluates identically, not because it syntactically refers to the original object. – Potatoswatter Oct 06 '10 at 05:11
  • @Potatoswatter: My understanding is 'j' is an alias for 'i' and even though 'i' is not in scope, 'j' can be used as long as 'i' remains a valid object it originally was. I am not so sure if I understand 'glvalue evaluates identically' vs 'glvalue is the same'. – Chubsdad Oct 06 '10 at 06:08
  • This is with reference to 5.5 - "If an expression initially has the type “reference to T” (8.3.2, 8.5.3), the type is adjusted to T prior to any further analysis. `The expression designates the object or function denoted by the reference`, and the expression is an lvalue or an xvalue, depending on the expression." – Chubsdad Oct 06 '10 at 06:17
  • @Chubsdad: It designates it by having an identical lvalue; that's the long and the short of it. Lvalue-to-rvalue conversion implements references having the value of the referent object. The reference doesn't tell the compiler to go look at the referenced variable and get its lvalue, because it might not know what variable is referenced. The compiler computes the lvalue of the reference and that lvalue identifies an object. If you want to debate this further, please open a new question. – Potatoswatter Oct 06 '10 at 06:46
  • @Potatoswatter: Your wish is my command :) (http://stackoverflow.com/questions/3870172/evaluation-of-a-reference-expression) – Chubsdad Oct 06 '10 at 10:52
  • I got it. But my doubt remains. Let's take 'i = ++i'. As per 13.6/18 it can be treated as 'operator=(i, operator++(i))'. The side effect on the scalar object ('i' due to the 2nd argument) is unsequenced relative to value computation of the same scalar object ('i' for the 1st argument). Hence the behavior should be undefined. Can you tell me why it should be well-defined (as you mentioned in your post) from this perspective of thought? – Chubsdad Oct 07 '10 at 05:01
  • @Chubsdad: By the standard, it is undefined, as Johannes explained. (I wish he hadn't deleted his answer.) However, it's next to impossible for a compiler to produce any behavior besides the desired, because the supposedly dependent value is a constant. Again, clause 13.6 does not apply. See the disclaimer in 13.6/1, "These candidate functions participate in the operator overload resolution process as described in 13.3.1.2 and are used for no other purpose." They are just dummy declarations; no function-call semantics exist with respect to built-in operators. – Potatoswatter Oct 07 '10 at 05:11
  • @Potatoswatter: So do you want to change your post as it says that Expr1 is well-formed? – Chubsdad Oct 07 '10 at 05:23
  • @Chubsdad: No, the last edit is spent entirely on discussion of Expr1, so the issue is clearly stated. I'm a bit tired of this. – Potatoswatter Oct 07 '10 at 05:31
  • @Chubsdad I think what @Potatoswatter refers to is that the Std says "using the value of the same scalar object". The value of an object is not used if you simply use an lvalue that refers to the object. You have to actually read the value, despite the weird term "value computation". But the term "value" in "value computation" does not seem to refer to the value of an object, but to the "value" of an expression. I.e the glvalue or prvalue result of an expression. But I really think the Standard should be clearer on this issue. – Johannes Schaub - litb Oct 08 '10 at 11:20
  • @Johannes Schaub - litb: Oh. Now I understand better. It was little too cryptic for me. I think I am starting to get a hang of the core issue but not there yet. – Chubsdad Oct 09 '10 at 02:56
  • I have found two DRs that support this answer: http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#637 and http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#222 – Johannes Schaub - litb Oct 10 '10 at 13:56
  • @PrasoonSaurav `All four expressions invoke UB in C++0x` you are mistaken, since `i = ++i` is perfectly defined in c++0x. http://stackoverflow.com/questions/17400137/order-of-evaluation-and-undefined-behaviour – Kolyunya Jul 01 '13 at 10:08
  • Thank you for this exhaustive answer. You mention operator `new` may be called prior to evaluation of constructor args. Is something similar true for `delete`? Take for example https://godbolt.org/z/n1qzrWsdq. It's basically `delete a->b;` where deleting `b` triggers `delete a` and the compiler is accessing through `a` again after the pointed-to object is dead. Speculating: `delete` is not a function but an operator, thus the deletion of `a` is not sequenced against dereferencing `a`? – emacs drives me nuts Apr 25 '23 at 19:24
0

In thinking about expressions like those mentioned, I find it useful to imagine a machine where memory has interlocks so that reading a memory location as part of a read-modify-write sequence will cause any attempted read or write, other than the concluding write of the sequence, to be stalled until the sequence completes. Such a machine would hardly be an absurd concept; indeed, such a design could simplify many multi-threaded code scenarios. On the other hand, an expression like "x=y++;" could fail on such a machine if 'x' and 'y' were references to the same variable, and the compiler's generated code did something like read-and-lock reg1=y; reg2=reg1+1; write x=reg1; write-and-unlock y=reg2. That would be a very reasonable code sequence on processors where writing a newly-computed value would impose a pipeline delay, but the write to x would lock up the processor if y were aliased to the same variable.
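The imagined machine can be modeled in a few lines. Everything below is hypothetical: `InterlockedCell` and its member functions are invented for this sketch, and a stalled write is modeled as a `false` return rather than an actual hang.

```cpp
#include <cassert>

// Hypothetical memory cell with a read-modify-write interlock.
struct InterlockedCell {
    int value = 0;
    bool locked = false;

    // Begins a read-modify-write sequence: reads the value and locks the cell.
    int read_and_lock() { locked = true; return value; }

    // Ordinary write: would stall while the cell is locked.
    // The stall is modeled by refusing the write and returning false.
    bool write(int v) {
        if (locked) return false;
        value = v;
        return true;
    }

    // Concluding write of the sequence: always allowed, releases the lock.
    void write_and_unlock(int v) { value = v; locked = false; }
};

// Models the code sequence described above for "x = y++".
// Returns false where the imagined machine would lock up.
bool assign_post_increment(InterlockedCell& x, InterlockedCell& y) {
    const int reg1 = y.read_and_lock();  // read-and-lock reg1 = y
    const int reg2 = reg1 + 1;           // reg2 = reg1 + 1
    if (!x.write(reg1)) return false;    // write x = reg1 (stalls if x is y)
    y.write_and_unlock(reg2);            // write-and-unlock y = reg2
    return true;
}
```

With two distinct cells the sequence completes and leaves `x` holding the old value of `y`; passing the same cell for both arguments hits the modeled stall.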

supercat