23

Take these three snippets of C code:

1) a = b + a++
2) a = b + a; a++
3) a = b + a, a++

Everyone knows that example 1 is a Very Bad Thing, and clearly invokes undefined behavior. Example 2 has no problems. My question is regarding example 3. Does the comma operator work like a semicolon in this kind of expression? Are 2 and 3 equivalent or is 3 just as undefined as 1?

Specifically I was considering this regarding something like free(foo), foo = bar. This is basically the same problem as above. Can I be sure that foo is freed before it's reassigned, or is this a clear sequence point problem?

I am aware that both examples are largely pointless and it makes far more sense to just use a semicolon and be done with it. I'm just asking out of curiosity.

dbush
Roflcopter4
  • 1
    Suggested reading: [What does i = (i, ++i, 1) + 1; do?](https://stackoverflow.com/q/30614396/2455888) – haccks May 03 '18 at 06:31
  • 2
    Note that the comma here is very different from the comma used to separate arguments in function calls, as in: `func(b+a, a++);`. – ComicSansMS May 03 '18 at 07:00
  • 1
    This depends on language and in case of C++, language version. – Lundin May 03 '18 at 09:25

2 Answers

39

Case 3 is well defined.

First, let's look at how the expression is parsed:

a = b + a, a++

The comma operator , has the lowest precedence, followed by the assignment operator =, the addition operator + and the postincrement operator ++. So with the implicit parentheses it is parsed as:

(a = (b + a)), (a++)

From here, section 6.5.17 of the C standard regarding the comma operator , says the following:

2 The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value.

Section 5.18 p1 of the C++11 standard has similar language:

A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression. Every value computation and side effect associated with the left expression is sequenced before every value computation and side effect associated with the right expression. The type and value of the result are the type and value of the right operand; the result is of the same value category as its right operand, and is a bit-field if its right operand is a glvalue and a bit-field.

Because of the sequence point, a = b + a is guaranteed to be fully evaluated before a++ in the expression a = b + a, a++.

Regarding free(foo), foo = bar, this also guarantees that foo is freed before a new value is assigned.

dbush
  • 1
    So you're saying this expression is evaluated as `(a = b + a), a++` rather than `a = (a + b, a++)`? – Mark Ransom May 03 '18 at 04:10
  • 3
    @MarkRansom Correct. The comma operator has lower precedence than assignment, and the addition operator has higher precedence than assignment. – dbush May 03 '18 at 04:12
  • @MarkRansom Yes, see [operator precedence in C](http://en.cppreference.com/w/c/language/operator_precedence) point 14 and 15. – Ajay Brahmakshatriya May 03 '18 at 04:13
  • 3
    @AjayBrahmakshatriya I could have looked it up myself, but I wanted an explanation associated with the answer. It's not intuitively obvious. – Mark Ransom May 03 '18 at 04:29
  • 1
    I suppose RTFM is usually the answer, but the C standards are rather voluminous. I appreciate the clear explanation and the reference. This is actually useful knowledge. It lets you do more than one thing in an if block without adding braces. – Roflcopter4 May 03 '18 at 07:09
  • 1
    That assumes of course no evil overloads of `operator,`. – Jarod42 May 03 '18 at 08:00
  • 1
    @MarkRansom have you any sane reason in mind why this would be parsed as `a = (a + b, a++)`? That makes a bit of sense in Python where the comma actually builds a tuple (though even there it would IMO be better if the parens were required), but not in C++ where the LHS of the comma is just thrown away. – leftaroundabout May 03 '18 at 09:31
  • I would normally argue for syntax sugar "to taste", but I have to say that anyone overloading `operator,` is asking for cavities. @leftaroundabout Are you asking within the context of order of operations as they are, or asking for a sane reason they should be ordered any other way? – John P May 03 '18 at 09:58
  • @Baldrickk Nothing. The only time it's a bother is when I really, really don't care about the code I am writing and don't feel like wasting several whole seconds (horror!) retroactively adding braces to a whole if else chain just because one part of it needs to define two things. – Roflcopter4 May 03 '18 at 10:20
  • @JohnP I don't know, thus my question: why would somebody think the `,` might take precedence? Of course ultimately the only reason is “because the hierarchy is such and such”, but the hierarchy was designed for making C code well-readable and even for modern C++ I'd always memorise it with something along “`<` takes precedence over `&&` so I can write `x<3 && y<4`”. For the comma operator, it would seem really odd if the C designers had made it bind more tightly than anything else. – leftaroundabout May 03 '18 at 10:32
  • 1
    @leftaroundabout my intuition said that *everything* would have a higher precedence than `=`. Python has nothing to do with it. The comma operator is such a strange beast. – Mark Ransom May 03 '18 at 11:43
  • 3
    My *guess* is that comma has lower precedence so that things like `for (Node *curr=head, prev=NULL; curr; prev=curr, curr=curr->next)` work well – Martin Bonner supports Monica May 03 '18 at 13:50
12

a = b + a, a++; is well-defined, but a = (b + a, a++); can be undefined.

First of all, the operator precedence makes the expression equivalent to (a = (b+a)), a++;, where, of the three binary operators, + has the highest precedence, followed by =, followed by ,. The comma operator includes a sequence point between the evaluation of its left and right operand. So the code is, uninterestingly, completely equivalent to:

a = b + a;
a++;

Which is of course well-defined.


Had we instead written a = (b + a, a++);, then the sequence point in the comma operator wouldn't save the day. Because then the expression would have been equivalent to

(void)(b + a);
a = a++;
  • In C and in C++14 or older, a = a++ is unsequenced (see C11 6.5.16/3), meaning this is undefined behavior (per C11 6.5/2). Note that C++11 and C++14 were badly formulated and ambiguous.
  • In C++17 or later, the operands of the = operator are sequenced right to left, so this is well-defined.

All of this assumes that no C++ operator overloading takes place. If an operator is overloaded, the arguments to the overloaded operator function are evaluated, a sequence point occurs before the function is called, and what happens from there depends on the internals of that function.

Lundin
  • Interesting that C++ would change that. `a = a++` is one of the most classic gotchas of C programming. I know a bit of C, but couldn't write anything much more complicated than a hello world in C++, so overloading is pretty foreign to me. I also see that you didn't much like my chess reference/joke. – Roflcopter4 May 03 '18 at 10:24
  • @Roflcopter4 There's really no reason why this would be undefined behavior - it is a shortcoming of the standard committees. It took the C++ committee 30 years to realize. – Lundin May 03 '18 at 10:45
  • 2
    @Lundin -- it's not a matter of realizing it, but of caring about it. Why would someone **decide to** write `a = a++`? And even if it's well defined, why would it survive a code review? – Pete Becker May 03 '18 at 12:05
  • 1
    @PeteBecker It might for example make perfect sense to write code like `while(i – Lundin May 03 '18 at 12:26
  • Also, the single-most asked C question on SO is "why does `[messy code with some flavour of i = i++]` not work as I expect?". No other FAQ is remotely near the frequency of this one. The most used canonical duplicate by far is [Why are these constructs (using ++) undefined behavior in C?](https://stackoverflow.com/questions/949433/why-are-these-constructs-using-undefined-behavior-in-c). So there is no question that people do write code like that, all the time, over and over. – Lundin May 03 '18 at 12:33
  • @Lundin -- different isn't the same. **Your answer** talks about `a = a++`; if that's not what you meant to talk about, then you need to change your answer. Making `a = a++` well defined is pointless in its own right. If it falls out from other changes, so be it, but it is still not something people should be encouraged to write. – Pete Becker May 03 '18 at 12:49
  • 2
    @PeteBecker It is the very same thing, because the issue is, as mentioned in the answer, that the evaluations of the operands of the assignment operator are not sequenced. This is true for any use of the assignment operator no matter what the operands happen to be. – Lundin May 03 '18 at 12:58
  • @Lundin -- "because of other language flaws" -- in other words, because you don't like the consequences. That's fine -- argue for your preferences. When you make claims like "it is a shortcoming of the standard committee" you're turning your (one-sided) technical argument into personal attacks, and that's out of line. – Pete Becker May 03 '18 at 13:08
  • @PeteBecker Just to clarify for everyone - the problematic formulation "shortcoming" is in a comment, not in the answer itself. I assume the issues with the C++11 and C++14 are actual technical details. I.e. the answer is fine in itself. – Hans Olsson May 03 '18 at 13:17
  • 1
    @PeteBecker This is silly. This is a well-known defect in the language which is the whole reason why the C++ committee fixed it in C++17 (reference C++17 draft, 8.18/1, this text was added: `The right operand is sequenced before the left operand.`). Similarly, when Java was designed (in the late 90s), they knew of this language issue and therefore didn't make the same mistake in Java. And so on. – Lundin May 03 '18 at 14:47
  • @Lundin -- it is, indeed, silly that you keep repeating your (one-sided) technical argument in response to a complaint about the **personal attack** that you made on all members of the C++ standards committee. (And citing Java as an example of good design doesn't help your argument.) – Pete Becker May 03 '18 at 15:10
  • @PeteBecker The standard committee is an institution (ultimately ISO), not a person. If they can't stand criticism, then they shouldn't work with industry standards distributed to millions of people. Every single case of known poorly-defined behavior in a language standard is a failure, that's just how it is. – Lundin May 03 '18 at 15:26
  • That being said, I have never said that Java or C++ were examples of good language design. This isn't the place for language wars. – Lundin May 03 '18 at 15:27
  • @Lundin: I hardly consider it obvious that the resolution of the left-side lvalue in an assignment should be always performed before the computation of the right-side value, nor that it should always be performed after. If one half of the assignment involves a load and the other half involves a function call (e.g. `*p = foo();` or `*(foo()) = x;`) performance will often be optimized by doing the function call first, regardless of which side it happens to be on. – supercat May 04 '18 at 22:47