28

If the value of the variable x is initially 0, the expression x += x += 1 will evaluate to 2 in C, and to 1 in Javascript.

The semantics for C seem obvious to me: x += x += 1 is interpreted as x += (x += 1), which is, in turn, equivalent to

x += 1
x += x  // where x is 1 at this point

What is the logic behind JavaScript's interpretation? What specification enforces such behaviour? (It should be noted, by the way, that Java agrees with JavaScript here.)

Update: It turns out the expression x += x += 1 has undefined behaviour according to the C standard (thanks ouah, John Bode, DarkDust, Drew Dormann), which seems to spoil the whole point of the question for some readers. The expression can be made standards-compliant by inserting an identity function into it as follows: x += id(x += 1). The same modification can be made to the JavaScript code, and the question still remains as stated. Presuming that the majority of readers can understand the point behind the "non-standards-compliant" formulation, I'll keep it, as it is more concise.

Update 2: It turns out that, according to C99, introducing the identity function probably does not resolve the ambiguity. In this case, dear reader, please regard the original question as pertaining to C++ rather than C99, where "+=" can most probably be safely regarded as an overloadable operator with a uniquely defined sequence of operations. That is, x += x += 1 is now equivalent to operator+=(x, operator+=(x, 1)). Sorry for the long road to standards-compliance.
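To make the two readings concrete, here is a minimal JavaScript sketch. The `box` object and the `addAssign` helper are hypothetical names introduced only for illustration. `addAssign` mimics a function-style `operator+=(x, y)` that reads its target after its argument has been computed, which is the reading assumed above for C++. The second function uses JavaScript's built-in compound assignment.

```javascript
// Hypothetical helper: models operator+=(x, y) as an ordinary function call.
// The target is read inside the call, i.e. after the argument was evaluated.
function addAssign(box, value) {
  box.value = box.value + value;
  return box.value;
}

// Function-style reading: operator+=(x, operator+=(x, 1))
function functionStyle() {
  const box = { value: 0 };
  return addAssign(box, addAssign(box, 1));
}

// Built-in JavaScript compound assignment: x += x += 1
function builtInStyle() {
  let x = 0;
  x += x += 1;
  return x;
}

console.log(functionStyle()); // 2
console.log(builtInStyle());  // 1
```

The only difference between the two functions is when the old value of the target is read: inside the call for the function-style version, before the right-hand side for the built-in operator.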

Community
KT.
  • 34
    It's an expression with multiple intertwingled side effects. Why does it matter? Don't write code like that. – Greg Hewgill Jan 23 '12 at 22:54
  • In JavaScript, you are taking zero, adding it to itself, and then adding one. I'm not sure what `specification` makes this behavior but that seems to make sense to me reading left to right. – Jasper Jan 23 '12 at 22:54
  • @Jasper. The += operator is right-associative. Hence, you should read it right to left, as x += (x += 1). For simplicity, we may assume I've put those brackets there. – KT. Jan 23 '12 at 22:56
  • 2
    @Greg. I am not suggesting anyone should write such code. The question is about programming language semantics and the inner workings of an interpreter/compiler. I feel there is something interesting happening (it is as if Javascript fixes all references to the variables in a statement and clones them before executing the line), but I can't explain it to myself yet. – KT. Jan 23 '12 at 23:04
  • Also note that `x=1;x+=x+=1` sets x to 3... Interesting. – Kenan Banks Jan 23 '12 at 23:05
  • Which JS engine are you using? – PLG Jan 23 '12 at 23:10
  • Throwing JS into the mix makes for a different flavour of `x++ + x++ + --x++` question. – David Heffernan Jan 23 '12 at 23:18
  • 3
    It evaluates differently because they are different languages with different rules. – sth Jan 23 '12 at 23:21
  • The "identity" function doesn't help: `x += id(x += 1)` exhibits unspecified behavior that may yield undefined behavior. The call of the identity function injects a sequence point into the expression, but the read of `x` on the left-hand side is still unsequenced with respect to the evaluation of `x += 1`. If the `x` on the left-hand side is evaluated after the `x += 1`, the behavior is well-defined. However, if the `x` on the left-hand side is evaluated before the `x += 1`, the behavior is undefined, because `x` is read and modified in unrelated operations without sequencing. – James McNellis Jan 24 '12 at 00:28
  • @James McNellis. Firstly, I would be grateful if you could point to a clause in the specification which implies that the LHS of the += may be read before the RHS is evaluated. Secondly, if that happens to be the case, it will still not change the point of the question as it stands. We may take C++ and come up with something like x.operator+=(x.operator+=(1)), which will express the same idea. Thirdly, I think your comment hides in it a legible answer to the whole question: evaluation order. – KT. Jan 24 '12 at 00:47
  • 3
    John Bode's answer already includes the important sentence from the spec: "the order of evaluation of subexpressions and the order in which side effects take place are both unspecified." One valid evaluation of `x += id(x += 1)` would be (1) read `x` on LHS, (2) evaluate `x += 1` on RHS, [Sequence Point] (3) evaluate `id(x += 1)`, (4) evaluate `x += id(x += 1)`. In this order of evaluation, the behavior is undefined because it violates "the prior value shall be read only to determine the value to be stored" (also from John Bode's answer), because `x` is read on the LHS but modified on the RHS. – James McNellis Jan 24 '12 at 00:55
  • 3
    To address your second point (`x.operator+=(x.operator+=(1))` in C++), if an operator overload is called, the behavior is well-defined because an operator overload is a function and function calls have stricter sequencing requirements. As for answering the question, I know almost nothing about JavaScript, and as has already been sufficiently noted, the behavior in C (and C++) is undefined (or unspecified and potentially undefined). See also the Stack Overflow C++ FAQ ["Undefined Behavior and Sequence Points."](http://stackoverflow.com/questions/4176328) – James McNellis Jan 24 '12 at 00:59
  • 1
    If you're really taking C99 out of scope and putting C++ in scope, I suggest editing the tags to remove [tag:c] and add [tag:c++]. Maybe a version-specific C++-tag if the functionality that allows `i += i += 1` was introduced with a specific version of the standard. – sarnold Jan 24 '12 at 02:44

6 Answers

27

x += x += 1; is undefined behavior in C.

The expression statement violates sequence points rules.

(C99, 6.5p2) "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression."

ouah
  • 3
    Even if it is indeed undefined behaviour (which, I think, is not completely obvious from the presented paragraph), I've tested this in a couple of C compilers and the expression behaves as expected. Thus, my question is more about what Javascript is doing here. Again, both Chrome's and Firefox's interpreters seem to agree on the interpretation. – KT. Jan 23 '12 at 23:09
  • 3
    Someone should note that this is only half of an answer. – Kenan Banks Jan 23 '12 at 23:09
  • The fact that it is undefined behavior in C means comparing the result with anything defined or undefined makes no sense. – ouah Jan 23 '12 at 23:12
  • 1
    "Thus, my question is more about what Javascript is doing here." Really you should ask one question about JS and one about C. They are different languages. – David Heffernan Jan 23 '12 at 23:12
  • 8
    @KT: It is most definitely UB and it only behaves "as expected" if by "as expected" you mean "as KT expects". On the basis of the C language there is no reason to expect it to have any meaningful result or not to crash your computer. – R.. GitHub STOP HELPING ICE Jan 23 '12 at 23:13
  • @David: The difference between those languages is what the question is about. – KT. Jan 23 '12 at 23:21
  • @R: OK, OK, stop torturing me with this undefined behaviour thing. I felt it was obvious for most people that (x += (x += 1)) has an "expected" evaluation order. If the sequence points are a problem I could have stated the question as (x += id(x += 1)) where id is an identity function. This would resolve the sequence point problem, yet the question would still stand exactly as stated. – KT. Jan 23 '12 at 23:23
  • @KT. Feel free to edit the question so that the C code is not UB. – David Heffernan Jan 23 '12 at 23:25
  • @KT the thing is: `x += id(x += 1);` is also UB for the same reasons. – ouah Jan 23 '12 at 23:26
  • @ouah: No, it isn't. `x += id(x += 1);` is equivalent to `x+=1; t1 = id(x); x+=t1;`. Remember you've got sequence points before and after you call a function. – jpalecek Jan 23 '12 at 23:31
  • 4
    @jpalecek this one is a trap. There is a sequence point at the function call, which means all side effects happen before the function's body is executed, but the order of evaluation of the operands of the assignment operator is still unspecified. – ouah Jan 24 '12 at 00:14
  • @ouah. The operand to id must be evaluated before id, including all the side effects (it does not matter that it is an assignment, I could also have had my own function inc(&x) there). The += assignment of id output must happen after id is invoked. What trap are you talking about? – KT. Jan 24 '12 at 00:25
  • @ouah: For example, consider the code myop(x, myop(x, 1)) where myop(int& x, int y) is defined appropriately. This is well defined, isn't it? Now just inline += instead of myop. – KT. Jan 24 '12 at 00:29
  • 6
    "Now just inline += instead of myop." When you make that change, the behavior becomes undefined (or, when using the `id` function, the behavior becomes unspecified and potentially undefined). Function calls have stronger sequencing requirements than fundamental operator evaluation. – James McNellis Jan 24 '12 at 00:33
15

JavaScript and Java have pretty much strict left-to-right evaluation rules for this expression. C does not (even in the version you provided that has the identity function intervening).

The ECMAScript spec I have (3rd Edition, which I'll admit is quite old – the current version can be found here: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf) says that compound assignment operators are evaluated like so:

11.13.2 Compound Assignment ( op= )

The production AssignmentExpression : LeftHandSideExpression @ = AssignmentExpression, where @ represents one of the operators indicated above, is evaluated as follows:

  1. Evaluate LeftHandSideExpression.
  2. Call GetValue(Result(1)).
  3. Evaluate AssignmentExpression.
  4. Call GetValue(Result(3)).
  5. Apply operator @ to Result(2) and Result(4).
  6. Call PutValue(Result(1), Result(5)).
  7. Return Result(5)
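The seven steps can be traced with a small JavaScript model. The `env` object and the `compoundAdd` helper are hypothetical stand-ins for the variable environment and the spec algorithm, with the operator fixed to `+`. They are an illustration, not part of any engine.

```javascript
// Hypothetical model of the steps for `name += <right side>`.
// `evalRight` is a callback standing in for "Evaluate AssignmentExpression".
function compoundAdd(env, name, evalRight) {
  const left = env[name];        // steps 1-2: evaluate the left side, GetValue
  const right = evalRight(env);  // steps 3-4: evaluate the right side, GetValue
  const result = left + right;   // step 5: apply the operator
  env[name] = result;            // step 6: PutValue
  return result;                 // step 7
}

// x += x += 1 with x initially 0
function traceExample() {
  const env = { x: 0 };
  const value = compoundAdd(env, "x", (e) => compoundAdd(e, "x", () => 1));
  return { value, x: env.x };
}

console.log(traceExample()); // { value: 1, x: 1 }
```

The outer call captures `left` as 0 before the inner call stores 1, so the final store is 0 + 1.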

You note that Java has the same behavior as JavaScript – I think its spec is more readable, so I'll post some snippets here (http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#15.7):

15.7 Evaluation Order

The Java programming language guarantees that the operands of operators appear to be evaluated in a specific evaluation order, namely, from left to right.

It is recommended that code not rely crucially on this specification. Code is usually clearer when each expression contains at most one side effect, as its outermost operation, and when code does not depend on exactly which exception arises as a consequence of the left-to-right evaluation of expressions.

15.7.1 Evaluate Left-Hand Operand First

The left-hand operand of a binary operator appears to be fully evaluated before any part of the right-hand operand is evaluated. For example, if the left-hand operand contains an assignment to a variable and the right-hand operand contains a reference to that same variable, then the value produced by the reference will reflect the fact that the assignment occurred first.

...

If the operator is a compound-assignment operator (§15.26.2), then evaluation of the left-hand operand includes both remembering the variable that the left-hand operand denotes and fetching and saving that variable's value for use in the implied combining operation.
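JavaScript follows the same left-to-right operand rule, so both quoted points can be sketched in JavaScript rather than Java (a choice made here only to keep the examples in one language). The function names are illustrative.

```javascript
// Left operand contains an assignment, right operand reads the same variable:
// the read on the right sees the assignment that happened on the left.
function leftAssignmentVisibleOnRight() {
  let i = 0;
  return (i = 2) * i; // 2 * 2
}

// Compound assignment: the old value of the left operand is saved first,
// so a later change made by the right operand does not affect it.
function savedLeftValue() {
  let a = 10;
  a += (a = 3); // saved 10, right side yields 3
  return a;
}

console.log(leftAssignmentVisibleOnRight()); // 4
console.log(savedLeftValue());               // 13
```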

On the other hand, in the not-undefined-behavior example where you provide an intermediate identity function:

x += id(x += 1);

while it's not undefined behavior (since the function call provides a sequence point), it's still unspecified behavior whether the leftmost x is evaluated before the function call or after. So, while it's not 'anything goes' undefined behavior, the C compiler is still permitted to evaluate both x variables before calling the id() function, in which case the final value stored to the variable will be 1.

For example, if x == 0 to start, the evaluation could look like:

tmp = x;                      // tmp == 0
x = tmp + id(x = tmp + 1);
// x == 1 at this point

or it could evaluate it like so:

tmp = id(x = x + 1);   // tmp == 1, x == 1
x = x + tmp;
// x == 2 at this point
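The two orderings can be modeled with explicit temporaries. The sketch below is JavaScript standing in for what a C compiler might do, not C itself; `id`, `readLeftFirst`, and `callFirst` are illustrative names.

```javascript
// Identity function, as in the question.
function id(v) {
  return v;
}

// Ordering 1: the left-hand x is read before the call.
function readLeftFirst(x) {
  const tmp = x;               // left operand read early
  x = tmp + id((x = tmp + 1)); // inner update happens, then the stale tmp is used
  return x;
}

// Ordering 2: the call (and the inner update) happens before the left-hand read.
function callFirst(x) {
  const tmp = id((x = x + 1));
  x = x + tmp;
  return x;
}

console.log(readLeftFirst(0)); // 1
console.log(callFirst(0));     // 2
```

The first ordering matches the JavaScript and Java result, and the second matches the result the question reports for C.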

Note that unspecified behavior is subtly different from undefined behavior, but it's still not desirable behavior.

Ry-
Michael Burr
  • 1
    Thanks! As I now summarize this for myself, Java/Javascript interpret the expression `x += y` as being equivalent to `x = x + y` with the left-to-right evaluation order imposed, while the C/C++ compilers I've tried (despite the standard not being strict here) interpret the expression `x += y` as being equivalent to something like `operator+=(x, y)`. I've gotten very used to the latter way of thinking, hence the surprise and this question. – KT. Jan 24 '12 at 01:27
  • 4
    @KT: I don't think your summary of C/C++'s behavior is right (leave aside overloading of the operator in C++). C evaluates `x += y` as `x = x + y` with `x` being evaluated only once (only important if the left side has a side-effect). But with a bunch of caveats that are especially important if the expression is part of a larger expression, such as order of evaluation issues and the undefined behavior issues brought up in various answers and comments. You seem to discount those comments, but they are core to the fact that in C, `x += x += 1` is buggy code with a meaningless result. – Michael Burr Jan 24 '12 at 01:41
  • 1
    I do understand that any particular standard is free to define things one way or the other, or leave them unspecified. My question was not whether the code complies with standard X, but rather about the semantics which (could) produce one or the other behaviour. It is easy to observe that, irrespective of the specification ambiguities, most C/C++ compiler makers seem to interpret things in one particular way, which, as it turns out, did not match the (implicit or explicit) semantics of some other languages with the same syntax. "How can it be" was my question. Now I know. – KT. Jan 24 '12 at 02:11
  • Nice answer, and congrats on hitting six digits of rep! – James McNellis Jan 24 '12 at 21:42
  • I am not convinced about the `id()` being correct. GCC 7.4.0 complains with `-Wall`. – Antti Haapala -- Слава Україні Nov 20 '19 at 15:41
9

In C, x += x += 1 is undefined behavior.

You cannot count on any result happening consistently, because updating the same object twice between sequence points is undefined.

Drew Dormann
  • 3
    Why is this getting so many upvotes? I don't see a shred of evidence, sorry! There is a list of operator precedences and associativity at [Wikipedia](http://en.wikipedia.org/wiki/Operators_in_C_and_C%2B%2B#Operator_precedence) and it appears that `+=` has right-to-left associativity. So it might be defined. Standard quotes? – Aaron McDaid Jan 23 '12 at 22:59
  • I'm busted! (My sources were other SO posts) Edited now. – Drew Dormann Jan 23 '12 at 23:01
  • 3
    @Triptych (C99, 6.5p2) "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression." – ouah Jan 23 '12 at 23:02
  • @helloandre, language specs typically don't have any ambiguity here. For example, `a=b=c` is clearly defined to be `b=c; a=b`. Compilers are not allowed to put the parentheses anywhere else. I don't see why `+=` would be an exception in C. – Aaron McDaid Jan 23 '12 at 23:02
  • What's a 'sequence point', @ouah? Is it what many would informally refer to as an expression? Is `x+=x+=1` a single sequence point? – Aaron McDaid Jan 23 '12 at 23:04
  • 1
    @AaronMcDaid Google `sequence point`. For C++11 you will need to look for `sequenced after/before`. This is clearly undefined in C and C++. – pmr Jan 23 '12 at 23:05
  • 2
    @aaronMcDaid He's completely right about C, can't say that same thing about Javascript. – jn1kk Jan 23 '12 at 23:05
  • @AaronMcDaid Start here: http://stackoverflow.com/questions/4176328/undefined-behavior-and-sequence-points – David Heffernan Jan 23 '12 at 23:11
  • 1
    @AaronMcDaid: This is a very frequently asked question in C here on SO, it has been answered *dozens* of times. Drew is absolutely correct in that it's undefined behavior, as is `x = x++ + x++;` since `x` is modified twice between two sequence points. – DarkDust Jan 23 '12 at 23:14
  • 2
    @Aaron: The issue has nothing to do with parentheses or order of evaluation. It has to do with order of side effects taking place, which is intentionally not specified. The sequence point rules serve to explicitly make any expression whose side effects could have order-dependent interactions **undefined behavior**. – R.. GitHub STOP HELPING ICE Jan 23 '12 at 23:15
  • That's a lot of comments directed at me, folks. Thanks for the repetition :-) – Aaron McDaid Jan 24 '12 at 00:04
  • ... anyway, Stack Overflow isn't just about giving correct answers, it's about giving some sort of information and evidence. Even without quotes from the standard, it is quite possible to give more details. – Aaron McDaid Jan 24 '12 at 00:06
  • 1
    @Aaron, the question is vastly uninteresting and it's clearly undefined behavior in C, it has been asked so many times in different forms. When you see double+ assignment/modification in double+ sequence forms, it's undefined. Usually it's easy to spot as it looks just awkward. – bestsss Jan 28 '12 at 09:15
  • @bestsss, Only one person has cast a vote to close this question. If this question did appear very often, then more people would have voted to close. – Aaron McDaid Jan 28 '12 at 12:09
5

At least in C, this is undefined behavior. The expression x += x += 1; has two sequence points: an implicit one right before the expression starts (that is, the previous sequence point), and then again at the ;. Between these two sequence points x is modified twice, and this is explicitly stated as undefined behavior by the C99 standard. The compiler is free to do anything it likes at this point, including making daemons fly out of your nose. If you're lucky, it simply does what you expect, but there is no guarantee of that.

This is the same reason why x = x++ + x++; is undefined in C. See also the C-FAQ for more examples and explanations of this or the StackOverflow C++ FAQ entry Undefined Behavior and Sequence Points (AFAIK the C++ rules for this are the same as for C).

Community
DarkDust
5

Several issues are at play here.

First and most important is this part of the C language specification:

6.5 Expressions
...
2 Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.72) Furthermore, the prior value shall be read only to determine the value to be stored.73)
...
72) A floating-point status flag is not an object and can be set more than once within an expression.

73) This paragraph renders undefined statement expressions such as
    i = ++i + 1;
    a[i++] = i;
while allowing
    i = i + 1;
    a[i] = i;

Emphasis mine.

The expression x += 1 modifies x (side effect). The expression x += x += 1 modifies x twice without an intervening sequence point, and it's not reading the prior value only to determine the new value to be stored; hence, the behavior is undefined (meaning any result is equally correct). Now, why on Earth would that be an issue? After all, += is right-associative, and everything's evaluated left-to-right, right?

Wrong.

3 The grouping of operators and operands is indicated by the syntax.74) Except as specified later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified.
...
74) The syntax specifies the precedence of operators in the evaluation of an expression, which is the same as the order of the major subclauses of this subclause, highest precedence first. Thus, for example, the expressions allowed as the operands of the binary + operator (6.5.6) are those expressions defined in 6.5.1 through 6.5.6. The exceptions are cast expressions (6.5.4) as operands of unary operators (6.5.3), and an operand contained between any of the following pairs of operators: grouping parentheses () (6.5.1), subscripting brackets [] (6.5.2.1), function-call parentheses () (6.5.2.2), and the conditional operator ?: (6.5.15).

Emphasis mine.

In general, precedence and associativity do not affect order of evaluation or the order in which side effects are applied. Here's one possible evaluation sequence:

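 t0 = x + 1    // value of the inner x += 1, nothing stored yet
 t1 = x + t0   // value of the outer +=, still using the old x
 x = t1        // side effect of the outer += is applied first
 x = t0        // side effect of the inner += is applied last and overwrites it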

Oops. Not what we wanted.
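Replaying that sequence with explicit temporaries shows why it is not what we wanted. This is a JavaScript model of one ordering a C compiler would be allowed to choose, not a claim about any particular compiler; `replaySequence` is an illustrative name.

```javascript
// Replays the four steps for a given starting value of x.
function replaySequence(x) {
  const t0 = x + 1;  // value of the inner x += 1
  const t1 = x + t0; // value of the outer +=, using the old x
  x = t1;            // outer store lands first
  x = t0;            // inner store lands last and overwrites it
  return x;
}

console.log(replaySequence(0)); // 1
console.log(replaySequence(1)); // 2
```

Starting from 1, the outer result 3 is computed and stored, then lost when the inner store writes 2.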

Now, other languages such as Java and C# (and I'm assuming Javascript) do specify that operands are always evaluated left-to-right, so there's always a well-defined order of evaluation.

John Bode
  • I edited the question to take the standards compliance out of focus. After all, this is not what the whole problem is about. – KT. Jan 23 '12 at 23:49
4

JavaScript evaluates the operands of an expression left to right.

The associativity of...

var x = 0;
x += x += 1

will be...

var x = 0;
x = (x + (x = (x + 1)))

So because of its left to right evaluation, the current value of x will be evaluated before any other operation takes place.

The result could be viewed like this...

var x = 0;
x = (0 + (x = (0 + 1)))

...which will clearly equal 1.

So...

   var x = 0;
   x = (x + (x = (x + 1)));
// x = (0 + (x = (0 + 1)));  // 1


   x = (x + (x = (x + 1)));
// x = (1 + (x = (1 + 1)));  // 3


   x = (x + (x = (x + 1)));
// x = (3 + (x = (3 + 1)));  // 7
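The same progression can be checked by running the compound form directly. This is a small sketch; `repeatCompound` is an illustrative name, not something from the answer.

```javascript
// Applies x += x += 1 repeatedly, starting from 0, and records each result.
function repeatCompound(times) {
  let x = 0;
  const seen = [];
  for (let i = 0; i < times; i++) {
    x += x += 1; // old x is read first, then x + 1 is added to it
    seen.push(x);
  }
  return seen;
}

console.log(repeatCompound(3)); // [ 1, 3, 7 ]
```

Each pass computes old x + (old x + 1), which matches the 1, 3, 7 sequence in the expanded form.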
Ry-