32

In C++, does the following have undefined behaviour:

int i = 0;
(i+=10)+=10;

There was some debate about this in the comments to my answer to "What's the result of += in C and C++?" The subtlety here is that the default response seems to be "yes", whereas it appears that the correct answer is "it depends on the version of the C++ standard".

If it does depend on the version of the standard, please explain where it's UB and where it's not.

NPE
  • 3
    Which C++, the current one or the old one? –  May 18 '12 at 15:23
  • 8
    @JohnDibling: I take it you think the answer is "yes". There's a pretty convincing argument in the comments to http://stackoverflow.com/a/10653994/367273 that the answer is, in fact, "no". – NPE May 18 '12 at 15:23
  • @Fanael raises a great point--which version of the C++ spec are you asking about? – Onorio Catenacci May 18 '12 at 15:24
  • 2
    @Fanael: Both. The more complete the answer, the better. – NPE May 18 '12 at 15:26
  • 4
    @OnorioCatenacci: buy him the spec, then we may talk start talking about downvoting. –  May 18 '12 at 15:31
  • D'oh. `s/talk //`, too late to edit now. –  May 18 '12 at 15:38
  • 9
    Arguably, this is a dupe of the FAQ question on sequence points, which itself should be updated to reflect C++11's new rules with sequence-before and -after. But I don't think I'm ready to argue that just yet, it might be better to mark the existing FAQ question clearly as C++03, and start all over again for C++11. – Steve Jessop May 18 '12 at 16:14
  • @SteveJessop: The complete answer to this question would probably involve some language that's specific to compound assignment operators (C++11 5.17.1). It's therefore not entirely clear whether a general page about sequence points would be the best place to address this specific query. But I totally agree with you about the existing FAQ page and C++11. – NPE May 18 '12 at 16:26
  • @SteveJessop: I think I'd favor starting over again for C++11. First, because the original question is specifically about sequence points, which simply don't exist (as such) in C++11. Second, because the rules have changed (considerably). The SRP applies to more than just code! – Jerry Coffin May 18 '12 at 17:49
  • +1 For the question (it never ceases to amaze me how subtle C++ can be) but... why would you ever want to write such a statement? E.g. C++ Coding Standards by Sutter & Alexandrescu (Item 6): Correctness, simplicity and clarity come first. – TemplateRex May 18 '12 at 20:38
  • @rhalbersma: This exact code came up in another question, and I wanted to make sure I understood whether it was permissible. – NPE May 18 '12 at 20:58
  • I understand, but what legitimate use cases could there be for such code? – TemplateRex May 18 '12 at 21:17
  • 2
    @steve the sequence point FAQ question has both a c++03 and c++11 answer. but I think that both of those answers (topvoted ones) are next to useless because they contain not much more than just standard quotes. to someone already familiar with the standard, it doesn't need an SO answer to explain this matter. and to anyone else, it needs *not that many standard quotes without explanation*. – Johannes Schaub - litb May 19 '12 at 17:23
  • @rhalbersma We all (unfortunately) know about the existence of such code in the wild and whenever we face it we need to know whether it's UB or not. – jorey May 20 '12 at 08:40

3 Answers

34

tl;dr: The sequence of the modifications and reads performed in (i+=10)+=10 is well defined in both C++98 and C++11; however, in C++98 this is not sufficient to make the behavior defined.

In C++98, multiple modifications to the same object without an intervening sequence point result in undefined behavior, even when the order of those modifications is well specified. This expression contains no sequence points, so the fact that it performs two modifications is enough to render its behavior undefined.

C++11 has no sequence points. For the behavior to be defined, it only requires that modifications of an object be sequenced relative to each other and to reads of the same object.

Therefore the behavior is undefined in C++98 but well defined in C++11.
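As a rough sketch of the C++11 case, the expression from the question can be wrapped in a small function. The name `chained_compound_assign` is made up for illustration, and the result is only guaranteed when the code is compiled under C++11 or later rules.

```cpp
#include <cassert>

// Hypothetical helper, not from the question: evaluates the expression
// under discussion. Well defined only under C++11 and later rules;
// under C++98/03 the same line is formally undefined behavior.
int chained_compound_assign(int start) {
    int i = start;
    (i += 10) += 10;  // inner += is sequenced before the outer += (C++11 [expr.ass]/1)
    return i;
}
```

Calling `chained_compound_assign(0)` from a C++11 program should yield 20, matching the ordering argument given in this answer.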


C++98

C++98 clause [expr] 5 p4

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified.

C++98 clause [expr.ass] 5.17 p1

The result of the assignment operation is the value stored in the left operand after the assignment has taken place; the result is an lvalue.

So I believe the order is specified. However, I don't see that this alone is enough to create a sequence point in the middle of an expression. Continuing with the quote of [expr] 5 p4:

Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression.

So even though the order is specified, it appears to me that this is not sufficient for defined behavior in C++98.
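For code that must also be valid under C++98/03, one possible rewrite (a sketch only; the function name `add_twice_portably` is hypothetical) is to put each modification in its own statement, so that the end of each full expression supplies the sequence point the rule above asks for.

```cpp
#include <cassert>

// Hypothetical helper: same net effect as (i += 10) += 10, but each
// modification is its own full expression, so a sequence point (the end
// of the statement) separates the two writes even under C++98/03 rules.
int add_twice_portably(int start) {
    int i = start;
    i += 10;  // first modification; sequence point at the ';'
    i += 10;  // second modification, after that sequence point
    return i;
}
```

This trades the single chained expression for two statements, which also tends to be easier to read.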


C++11

C++11 does away with sequence points in favor of the much clearer idea of sequenced-before and sequenced-after. The language from C++98 is replaced with

C++11 [intro.execution] 1.9 p15

Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [...]

If a side effect on a scalar object is unsequenced relative to either another side effect on the same scalar object or a value computation using the value of the same scalar object, the behavior is undefined.

C++11 [expr.ass] 5.17 p1

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.

So while being ordered was not sufficient to make the behavior defined in C++98, C++11 has changed the requirement such that being ordered (i.e., sequenced) is sufficient.

(And it seems to me that the extra flexibility afforded by 'sequenced before' and 'sequenced after' has led to a much clearer, more consistent, and better specified language.)


It seems unlikely to me that any C++98 implementation would actually do anything surprising when the sequence of operations is well specified even if that is insufficient to produce technically well defined behavior. As an example, the internal representation of this expression produced by Clang in C++98 mode has well defined behavior and does the expected thing.

bames53
  • "whereas in C++98 the value assigned to j is indeterminate but the behavior is not undefined" - this is incorrect. C++98 says that the previous value of "i" shall only be read to determine the value to be stored in a modification when there's no intervening sequence point. But in "(i+=1) + i", it is unspecified whether the "+ i" reads the previous or the next value of "i", so the consequence is there's undefined behavior. – Johannes Schaub - litb May 18 '12 at 20:21
  • @JohannesSchaub-litb Could you give me a citation so I can see it in the C++98 standard in context? – bames53 May 18 '12 at 20:36
  • it is right after the text you quoted that ends with "... shall have its stored value modified at most once by the evaluation of an expression.". – Johannes Schaub - litb May 18 '12 at 20:38
  • @JohannesSchaub-litb I think you're probably right that the language was intended to mean that and that the requirement I had thought was new in C++11 isn't actually new. But the C++98 language definitely needed cleaning up because as written, "the prior value shall be accessed only to [...]," does not mean the same thing as "only the prior value shall be accessed." – bames53 May 18 '12 at 21:00
  • @JohannesSchaub-litb However, if the intended meaning is "only the prior value shall be accessed" then does that mean that there's no requirement to access the prior value "only to determine the value to be stored?" I.e. am I allowed to access the prior value for other purposes? – bames53 May 18 '12 at 21:06
  • It does not mean "only the prior value shall be accessed" and should not mean that. I don't know why you think that. It should mean exactly what it says - you may only access the prior value if that access is merely to determine the new value to be stored. Accessing the new value is fine. – Johannes Schaub - litb May 18 '12 at 21:21
  • @JohannesSchaub-litb But then isn't `(i+=1) + i` simply unspecified behavior because it only modifies `i` once between sequence points and, if the right hand happens to be evaluated first, accesses the prior value of `i` only to determine the value to be stored? And if the right hand is evaluated second then the prior value isn't accessed at all and so the restriction that "the prior value shall be accessed only [...]" doesn't apply (whereas a restriction that "only the prior value shall be accessed" would make it UB). – bames53 May 18 '12 at 21:58
  • @JohannesSchaub-litb Oh, I think I understand now. It specifically means the value to be stored by the one allowed modification, not just any value to be stored by any further evaluation of the whole expression. In other words the modifying operation can access the prior value, but nothing else can. – bames53 May 18 '12 at 22:12
  • @JohannesSchaub-litb The confusion was that I was interpreting "to determine the value to be stored," not as "to determine the value stored _by the previously mentioned modifying operation_," but as "to determine _some_ value to be stored _somewhere_." Under that wrong interpretation the right hand side of `(i+=1) + i` constitutes accessing the prior value of `i` to determine the value that will be stored in, for example, `j` in `j = (i+=1) + i;` – bames53 May 18 '12 at 22:33
  • @bames53: "*the C++98 language definitely needed cleaning up*" -- yes, and the cleanup is called C++11. – Keith Thompson Jun 18 '13 at 18:26
  • Does the "value computation" of `(i+=10)` include the side effect of modifying `i`? In other words, is the side effect of the second `+=` necessarily *sequenced after* the side effect of the first `+=`? In C++98 terms, the side effect can occur any time before the next sequence point. Surely an implementation *could* compute the result of `i+=10` (i.e., `i+10`) without actually modifying `i`; is that permitted in the abstract machine? – Keith Thompson Jun 18 '13 at 18:31
  • @KeithThompson In C++11 the side effect, i.e. the assignment to `i`, is separate from _and sequenced before_ the value computation of the assignment expression `i+=10`. The assignment is also sequenced after the value computation of either the left or right sides. See 5.17/1 – bames53 Jun 18 '13 at 18:46
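Since `(i+=1) + i` from the comments above reads `i` without ordering that read against the modification, one way to sidestep the question is to split it into statements. The sketch below assumes the intended meaning is "increment first, then add the new value to itself"; the function name `sum_after_increment` is hypothetical, and the other reading would need a different rewrite.

```cpp
#include <cassert>

// Hypothetical rewrite of j = (i += 1) + i that picks one ordering
// explicitly instead of relying on operand evaluation order.
int sum_after_increment(int start) {
    int i = start;
    i += 1;         // the modification completes at the end of this statement
    int j = i + i;  // both reads now see the incremented value
    return j;
}
```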
20

In C++11 the expression is well defined and will result in i == 20.

From [expr.ass]/1:

In all cases, the assignment is sequenced after the value computation of the right and left operands, and before the value computation of the assignment expression.

This means that the assignment i+=10 is sequenced before the value computation of the left hand side of (i+=10)+=10, which is in turn sequenced before the final assignment to i.
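The ordering described above can be spelled out step by step. This is only an illustrative expansion, not what a compiler literally generates, and the names `expanded_evaluation` and `inner` are invented for the sketch.

```cpp
#include <cassert>

// Hypothetical step-by-step expansion of (i += 10) += 10 following the
// C++11 ordering: inner assignment, then its lvalue result, then the
// outer assignment through that lvalue.
int expanded_evaluation(int start) {
    int i = start;
    int& inner = (i += 10);  // 1. inner assignment completes and yields an lvalue
    assert(&inner == &i);    //    that lvalue refers to i itself, not to a copy
    inner += 10;             // 2. outer assignment modifies i through the lvalue
    return i;
}
```

Because each step here is its own statement, this expanded form is ordered under the older sequence-point rules as well.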


In C++03 the expression has undefined behavior, because it causes i to be modified twice with no intervening sequence point.

Mankarse
  • 3
    Does this answer apply to C++11 only, or to C++ in general ? – Paul R May 18 '12 at 15:22
  • 1
    @PaulR: C++11 only. C++03 had no notion of "sequenced after", it used sequence points. And in fact, in C++03 it is undefined, because there's no intervening sequence point between the assignments. –  May 18 '12 at 15:23
  • 1
    Thanks - this should really be clarified in the answer then, as the question does not specify C++11. – Paul R May 18 '12 at 15:25
  • @PaulR In C++98 the paragraph says "The result of the assignment operation is the value stored in the left operand after the assignment has taken place;" I'm not sure if that technically means there's a sequence point between assignments. At the very least I think this is under-specified in C++98. – bames53 May 18 '12 at 15:27
  • ... maybe it's worth mentioning that because the result of i+=10 is an lvalue (i.e. the "i" object itself), the modification of i cannot be treated as an "independent" side effect (one that is not sequenced with anything until the final ";"). – user396672 May 18 '12 at 15:44
  • @bames53 it just says that the assignment has taken place before further evaluation takes place on the result. That doesn't formally mean that there is a sequence point (there could be side effects besides the assignment, and those will not necessarily have taken place before further evaluation takes place). As little sense as it makes, it still is UB because of the two assignments conflicting, even though formally they cannot conflict because of this sequencing (good that this has been cleaned up in C++11). – Johannes Schaub - litb May 18 '12 at 20:13
  • @JohannesSchaub-litb Yeah, that was the conclusion I reached and included in my own answer. Though it's definitely not the most straightforward thing to have two different concepts of ordering where one of them counts for avoiding UB and the other doesn't. The C++11 concepts are a definite improvement in that regard. – bames53 May 18 '12 at 20:24
  • What it does guarantee you though, is that `j = i = 0` works: You read the value of `i` and write to it without an intervening sequence point. However you don't read the previous value of `i`, but the current value as affected by the side effect. If the text didn't say "... after the assignment has taken place", this would not work and would be UB. – Johannes Schaub - litb May 19 '12 at 17:20
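The `j = i = 0` case from the last comment can be checked with a short sketch. The function name `chained_simple_assign` and the starting values are arbitrary choices for illustration.

```cpp
#include <cassert>

// Hypothetical illustration of chained simple assignment.
int chained_simple_assign() {
    int i = 42;
    int j = 7;
    j = i = 0;  // parsed as j = (i = 0); j receives the value stored in i
    assert(i == 0);
    return j;
}
```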
13

Maybe. It depends on the C++ version.

In C++03, it's clearly UB: there's no intervening sequence point between the assignments.

In C++11, as Mankarse explains, it's not undefined anymore. The parenthesized compound assignment is sequenced before the outer one, so it's okay.

  • 8
    The C++03 answer raises a new question -- why did the assignment operator return an lvalue in C++03 if it was indeed undefined to modify its result? – Mankarse May 18 '12 at 15:32
  • 3
    @Mankarse: Now, that's a *great* question. – NPE May 18 '12 at 15:33
  • 3
    @Mankarse: presumably for consistency with simple assignment, so you can do both `f(x = 5)` and `f(x += 5)` with `f` wanting a `T&`. –  May 18 '12 at 15:35
  • 3
    @Fanael: I know you're just guessing, but I don't follow your logic. If the assignment operators did not return an _lvalue_, then neither `f(x = 5)` nor `f(x += 5)` would compile (for `f` wanting an `int&`). How would that be any less consistent? – Mankarse May 18 '12 at 15:56
  • @Mankarse: right, I didn't think of that. In this case I think we should just stick to "nobody knows why". –  May 18 '12 at 16:00
  • 5
    @Mankarse The given justification (which I don't really agree with) is to support things like `int& f(int& i) { return i += 2; }` – James Kanze May 18 '12 at 16:27
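To make the last comment concrete, here is a small sketch built around the `f` quoted there. The caller `use_f` is hypothetical and only shows that the returned reference aliases the argument.

```cpp
#include <cassert>

// The function from the comment above: it compiles only because
// compound assignment yields an lvalue.
int& f(int& i) { return i += 2; }

// Hypothetical caller showing that the returned reference aliases x.
int use_f() {
    int x = 1;
    int& r = f(x);  // x becomes 3; r refers to x
    assert(&r == &x);
    r = 10;         // a separate statement, so this write is ordered after the += in f
    return x;
}
```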