Comma operator in C++11 (sequencing)

Question

The standard mentions f(a,(t=3,t+2),c); which, according to my understanding, would be an assignment-expression followed by an expression for the 2nd argument.

But the grammar lists it juxtaposed:

expression:
    assignment-expression
    expression , assignment-expression

Working Draft, Standard for Programming Language C++, Revision N4140 (November 2014)

Could someone please explain what I'm missing here?

First, an *assignment-expression* cannot contain a comma (outside of brackets or quotes, that is). Therefore, the *assignment-expression* to the left of the operator cannot be extended to the right of ``t=3``. Second, an *assignment-expression* is one that *may* contain an assignment; it doesn't have to. So, each side of the operator is technically an *assignment-expression*. This is just the weird but very useful world of context-free grammars (see https://en.wikipedia.org/wiki/Context-free_grammar). The names of these rules often don't match what you would expect from natural language. – Arne Vogel, Sep 22 '17 at 10:11

score 10 · Accepted Answer · answered Sep 21 '17 at 10:48

When you see

 expression:
    assignment-expression
    expression, assignment-expression

It means that there are two possibilities for expression. One is that it is just an assignment-expression, which is defined earlier in the grammar. The other is the recursive form: expression, assignment-expression.

So after expanding the recursion, you find that an expression is a comma-separated list of one or more assignment-expressions.

In the sample you mentioned, the second argument is the expression (t=3,t+2), which consists of two comma-separated assignment-expressions. Since it appears "In contexts where comma is given a special meaning", it has to "appear only in parentheses".

To see why an assignment-expression can take the form t+2, walk down through its definition, always choosing the first alternative:

assignment-expression
-> conditional-expression
--> logical-or-expression
---> logical-and-expression
----> inclusive-or-expression
-----> exclusive-or-expression
------> and-expression
-------> equality-expression
--------> relational-expression
---------> shift-expression
----------> additive-expression - this is what you see
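
The effect of the parentheses can be checked with a small sketch. The helper count_args is hypothetical (it is not from the standard or the question); it only reports how many arguments a call supplied. A parenthesized comma expression counts as one argument, while unparenthesized commas separate arguments:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration: returns the number of arguments passed.
template <class... Ts>
std::size_t count_args(const Ts&...) { return sizeof...(Ts); }

// The parenthesized (t = 3, t + 2) is a single expression, so this is a
// three-argument call, as in f(a,(t=3,t+2),c).
std::size_t with_parentheses() {
    int a = 1, c = 9, t = 0;
    return count_args(a, (t = 3, t + 2), c);
}

// Without parentheses each comma separates two arguments: four in total.
std::size_t without_parentheses() {
    int a = 1, c = 9, t = 0;
    return count_args(a, t, t + 2, c);
}
```

Calling with_parentheses() should give 3 and without_parentheses() should give 4.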

score 5 · Answer 2 · edited Jun 20 '20 at 09:12

Note that since the definition of expression is

expression:
    assignment-expression
    expression , assignment-expression

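the second line means that any assignment-expression can be considered an expression, which is why t=3, t+2 is a valid expression. (t=3 is an expression by that line, and the third line then allows appending , t+2 because t+2 is also an assignment-expression.)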

So why is the grammar this way? First note that the grammar for expressions builds its way in steps from the most tightly bound category primary-expression to the least tightly bound category expression. (And then the fact that "( expression )" is a primary-expression brings the expression grammar full circle and lets us cause any expression to be more tightly bound than everything that surrounds it by adding parentheses.)

For example, the well-known fact that binary * binds tighter than binary + follows from these grammar pieces:

multiplicative-expression:
    pm-expression
    multiplicative-expression * pm-expression
    multiplicative-expression / pm-expression
    multiplicative-expression % pm-expression

additive-expression:
    multiplicative-expression
    additive-expression + multiplicative-expression
    additive-expression - multiplicative-expression

In the expression 2 + 3 * 4, the literals 2, 3, and 4 can each be considered a pm-expression, and therefore also a multiplicative-expression or an additive-expression. So you might say 2 + 3 would qualify as an additive-expression, but it is not a multiplicative-expression, so it cannot be the left operand of *, and the full 2 + 3 * 4 can't be parsed that way. Instead the grammar forces 3 * 4 to be considered a multiplicative-expression, so that 2 + 3 * 4 can be an additive-expression. Therefore 3 * 4 is a subexpression of the binary +.

Or in the expression 2 * 3 + 4, 3 + 4 might be considered an additive-expression, but then it is not a pm-expression, so that doesn't work. Instead the parser must recognize that 2 * 3 is a multiplicative-expression, which is also an additive-expression, so 2 * 3 + 4 is a valid additive-expression, with 2 * 3 as a subexpression of the binary +.

The recursive nature of most grammar definitions matters when the same operator is used twice, or two operators with the same precedence are used.
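
One way to see how the layered, recursive rules give both precedence and left-to-right grouping is a toy evaluator with one function per grammar level. This is only a sketch under stated assumptions: the names Parser and eval are made up here, it handles just non-negative integer literals, parentheses, +, - and *, it does no error checking, and each left-recursive rule is written as a loop, which is a common implementation substitute and not something the standard prescribes.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Toy evaluator, illustrative only: one member function per grammar level.
struct Parser {
    std::string src;
    std::size_t pos;

    explicit Parser(std::string text) : src(std::move(text)), pos(0) {}

    // Skip spaces and return the next character ('\0' at end of input).
    char peek() {
        while (pos < src.size() && src[pos] == ' ') ++pos;
        return pos < src.size() ? src[pos] : '\0';
    }

    // primary: integer-literal | ( additive )
    long primary() {
        if (peek() == '(') {
            ++pos;                    // consume '('
            long inner = additive();
            peek();                   // skip spaces before ')'
            ++pos;                    // consume ')'
            return inner;
        }
        long value = 0;
        while (pos < src.size() &&
               std::isdigit(static_cast<unsigned char>(src[pos])))
            value = value * 10 + (src[pos++] - '0');
        return value;
    }

    // multiplicative: primary | multiplicative * primary
    // The loop folds operands from the left, matching the left recursion.
    long multiplicative() {
        long value = primary();
        while (peek() == '*') {
            ++pos;
            value *= primary();
        }
        return value;
    }

    // additive: multiplicative | additive + multiplicative
    //                          | additive - multiplicative
    long additive() {
        long value = multiplicative();
        for (;;) {
            char op = peek();
            if (op == '+')      { ++pos; value += multiplicative(); }
            else if (op == '-') { ++pos; value -= multiplicative(); }
            else return value;
        }
    }
};

long eval(const std::string& text) {
    Parser parser(text);
    return parser.additive();
}
```

With this sketch, eval("2 + 3 * 4") yields 14 and eval("2 * 3 + 4") yields 10, because an operand of + must be a whole multiplicative level. eval("10 - 4 - 3") yields 3, that is (10 - 4) - 3, because the right operand of - can only be a multiplicative level and never another subtraction.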

Going back to the comma grammar, if we have the tokens "a, b, c", we might say b, c could be an expression, but it is not an assignment-expression, so b, c cannot be a subexpression of the whole. Instead the grammar requires recognizing a, b as an expression, which is allowed as a left subexpression of another comma operator, so a, b, c is also an expression with a, b as the left operand.

This doesn't make any difference for the built-in comma, since its meaning is associative: "evaluate and discard a, then the result value comes from evaluating (evaluate and discard b, then the result value comes from evaluating c)" does the same as "evaluate and discard (evaluate and discard a, then the result value comes from evaluating b), then the result value comes from evaluating c".
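
A small sketch can make this concrete for the built-in comma. The helper note and the global trace are hypothetical names used only to record evaluation order; both explicit groupings should evaluate a, b, c in that order and produce the value of c:

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the order in which operands are evaluated

// Hypothetical helper: appends its name to the trace and returns its value.
int note(const char* name, int value) {
    trace += name;
    return value;
}

// Grouping the grammar actually produces: (a, b), c
int left_grouped() {
    trace.clear();
    return ((note("a", 1), note("b", 2)), note("c", 3));
}

// Explicitly forced alternative grouping: a, (b, c)
int right_grouped() {
    trace.clear();
    return (note("a", 1), (note("b", 2), note("c", 3)));
}
```

Both functions return 3 and leave "abc" in trace, since the built-in comma always sequences its left operand before its right one.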

But it does give us a clearly-defined behavior in case of an overloaded operator,. Given:

struct X {};
X operator,(X, X);
X a, b, c;
X d = (a, b, c);

we know that the last line means

X d = operator,(operator,(a,b), c);

and not

X d = operator,(a, operator,(b,c));

(I'd consider it particularly evil to define a non-associative operator,, but it is allowed.)
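
To observe the grouping directly, here is a sketch with a deliberately non-associative overload. The text member and the function names are assumptions added for illustration; the overload just records how its operands were grouped:

```cpp
#include <cassert>
#include <string>

struct X {
    std::string text;  // illustrative payload, not part of the original X
};

// Deliberately non-associative overload: the result spells out the grouping.
X operator,(const X& lhs, const X& rhs) {
    return X{"(" + lhs.text + "," + rhs.text + ")"};
}

// Grouping chosen by the grammar for a, b, c.
std::string default_grouping() {
    X a{"a"}, b{"b"}, c{"c"};
    X d = (a, b, c);
    return d.text;
}

// Grouping forced with extra parentheses, for comparison.
std::string forced_right_grouping() {
    X a{"a"}, b{"b"}, c{"c"};
    X d = (a, (b, c));
    return d.text;
}
```

default_grouping() returns "((a,b),c)", matching operator,(operator,(a,b), c), while forced_right_grouping() returns "(a,(b,c))".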

You wrote: "the second line means that any _assignment-expression_ can be considered an _expression_, which is why `t=3, t+2` is a valid expression." To prove this, you have to show that `t+2` is an _assignment-expression_. Could you show how to do this? — Alexander, Mar 23 '20 at 18:28
In this context, `t` is an _identifier_, _unqualified-id_, _id-expression_, and _primary-expression_. `2` is a _literal_ and _primary-expression_. Each _primary-expression_ is a _postfix-expression_, _unary-expression_, _cast-expression_, _pm-expression_, and _multiplicative-expression_. `t` is also an _additive-expression_, and `t+2` is an _additive-expression_, _shift-expression_, ..., _conditional-expression_, and _assignment-expression_. — aschepler, Mar 23 '20 at 19:15

score 2 · Answer 3 · answered Sep 21 '17 at 10:43

This is the syntax notation (see §1.6 of N4140).

It mainly encodes operator precedence, but the names of the grammar rules can be misleading.

For example, in [expr.ass] (§5.18) you have the following definition:

assignment-expression:
   conditional-expression
   logical-or-expression assignment-operator initializer-clause
   throw-expression

assignment-operator: one of
   = *= /= %= += -= >>= <<= &= ^= |=

So an assignment-expression can be a conditional-expression or a throw-expression even if neither performs any assignment.

This just states that a = b, throw 10 or cond ? c : d are expressions with the same precedence order.

Those three expressions do not have the same precedence order. The grammar specifies that `throw` binds tighter than `=`, and `=` to the left of `?:` is looser than the `?:`, and `=` to the right of `?:` binds tighter than the `?:`. — aschepler, Sep 21 '17 at 11:04
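
Two of these points can be checked with a short sketch; the function names are made up for illustration. The first function uses a throw-expression where the grammar asks for an assignment-expression (the third operand of ?:). The second relies on `ok ? a : b = 5` parsing as `ok ? a : (b = 5)`, since the left side of = must be a logical-or-expression and a whole conditional-expression does not qualify:

```cpp
#include <cassert>

// A throw-expression is an assignment-expression, so it can be the third
// operand of ?: even though it assigns nothing.
int checked(bool ok, int value) {
    return ok ? value : throw 10;
}

// `ok ? a : b = 5` groups as `ok ? a : (b = 5)`.
// Returns chosen * 10 + b so both effects are visible in one number.
int third_operand_assignment(bool ok) {
    int a = 1, b = 2;
    int chosen = ok ? a : b = 5;
    return chosen * 10 + b;
}
```

third_operand_assignment(true) gives 12 (a is chosen and b is untouched), and third_operand_assignment(false) gives 55 (b is assigned 5 and then chosen). checked(false, 7) throws the int 10.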

score 0 · Answer 4 · Jayesh · 2017-09-21T10:56:09.873

 f(a,(t=3,t+2),c);

Here, 3 is first stored in the variable t, and then the function f() is called with three arguments. The second argument therefore evaluates to t+2, which is 5, and that value is passed to the function.
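
A minimal sketch of this, assuming a stand-in function second_of_three in place of f (the question does not show f's definition):

```cpp
#include <cassert>

// Hypothetical stand-in for f: returns its second argument.
int second_of_three(int, int second, int) {
    return second;
}

int demo() {
    int a = 1, c = 9, t = 0;
    // t = 3 is evaluated and discarded first, then t + 2 supplies the value.
    return second_of_three(a, (t = 3, t + 2), c);
}
```

demo() returns 5.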

edited Sep 21 '17 at 10:56

answered Sep 21 '17 at 10:51

Yes, that's what happens. But how does it match the grammar *expression, assignment-expression* where it looks like the assignment should be *after* the comma? – Bo Persson Sep 21 '17 at 10:56
