
I'm writing a C compiler which follows this standard, and if I parse statements like this:

int i;
(i) = 1;

my compiler reports an error saying that (i) is an rvalue and therefore not assignable.

I checked the code and the rules, and found this: in assignment expression semantics:

An assignment operator shall have a modifiable lvalue as its left operand.

An assignment expression has the value of the left operand after the assignment, but is not an lvalue.

In my case, there are two assignment expressions: (i) = 1 and the i in parentheses. So (i) should be an rvalue.

So my question is: Is (i) = 1 illegal in this C standard?

curiousguy
  • It's still an lvalue; the wrapping parentheses don't change that. – Jeff Mercado Oct 12 '19 at 09:17
  • Hi all, if it's legal, please point out which rule in the standard says so. My compiler follows the rules rigorously. –  Oct 12 '19 at 09:19
  • `i` in parentheses is not an assignment expression. Assignment expression does not mean "expression involved in an assignment" or anything else where `(i)` would qualify. Assignment expressions are *assignments*. – user2357112 Oct 12 '19 at 09:30
  • @user2357112 `i` is an assignment expression, the AST tree is EXPRESSION->ASSIGNMENT_EXPRESSION->CONDITIONAL_EXPRESSION->LOGICAL_OR_EXPRESSION->CAST_EXPRESSION->UNARY_EXPRESSION->POSTFIX_EXPRESSION->PRIMARY_EXPRESSION->IDENTIFIER –  Oct 12 '19 at 11:18
  • @reavenisadesk: An `assignment-expression` grammar nonterminal is not the same thing as an assignment expression. Roughly, an `assignment-expression` is an assignment expression or anything with higher precedence. – user2357112 Oct 12 '19 at 11:26
  • @user2357112 totally confused, can you explain more? for example, explain the standard in your own way? –  Oct 12 '19 at 11:37
  • @reavenisadesk: As is conventional for this kind of grammar, the standard calls the `assignment-expression` nonterminal symbol `assignment-expression` because it'd be far clunkier to write `assignment-expression-or-anything-of-higher-precedence`, or to explicitly name every possible type of expression the nonterminal could expand to. An `assignment-expression` can expand to many different types of expression, but an assignment expression is specifically an expression with an assignment operator in the middle and operands on the left and right. – user2357112 Oct 12 '19 at 11:42
  • @user2357112 Hi, as I asked @EricPostpischil: does the standard emphasize this anywhere? Or is it an English convention (English is not my first language)? Or is it common knowledge, so the standard does not point it out specially? I read the standard carefully for a really long time but never figured out that "assignment expression" refers only to the second production of `assignment-expression`. –  Oct 12 '19 at 12:26
  • @reavenisadesk The question is not, "Where in the Standard does it say this is an lvalue?" The question is, "Where in the Standard does it say that parentheses force an rvalue?", and the answer is, "Nowhere". There are a number of operators that, in effect, force their operands to be rvalues, but parentheses are *not* one of them. When you parenthesize an lvalue, it stays an lvalue. – Steve Summit Oct 12 '19 at 13:44
  • @reavenisadesk It sounds like your compiler may be separating syntactic and semantic analysis improperly. Parentheses have everything to do with syntactic analysis, and most compiler writers would say, I think, that they have nothing to do with semantics. Parentheses force a particular parse; constrain the construction of the internal data structure (parse tree, etc.) representing a parsed expression. But by the time you're evaluating an expression, and (among other things) deciding when to dereference lvalues into rvalues, you shouldn't care (or even know) where the parentheses were. – Steve Summit Oct 12 '19 at 13:48

3 Answers


To quote n1570 (the last C11 standard draft prior to publication):

6.5.1 Primary expressions (emphasis mine)

5 A parenthesized expression is a primary expression. Its type and value are identical to those of the unparenthesized expression. It is an lvalue, a function designator, or a void expression if the unparenthesized expression is, respectively, an lvalue, a function designator, or a void expression.

i is an lvalue, so per the above so is (i). And to answer your question, the expression (i) = 1 is valid C.
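The rule quoted above can be checked with a small function. This is only a sketch: the function name `assign_through_parens` is made up for illustration, and the comments cite paragraph numbers from n1570.

```c
/* Assigns through parenthesized lvalues and returns the final value of i. */
int assign_through_parens(void)
{
    int i = 0;

    (i) = 1;        /* same object as i: (i) is an lvalue per 6.5.1p5 */
    ((i)) += 2;     /* extra nesting changes nothing */

    int *p = &(i);  /* unary & accepts the parenthesized lvalue as well */
    (*p) *= 2;      /* a parenthesized dereference is an lvalue too */

    return i;
}
```

A conforming compiler should accept every statement here without a diagnostic, since each left operand is still a modifiable lvalue after the parentheses are taken into account.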

StoryTeller - Unslander Monica
  • I have a problem with this. Look at the BNF of the parenthesized primary expression: `( expression )`. Its type and value are identical to those of `expression`; `expression` is an `assignment expression`, and the `assignment expression` is a `primary expression`. In short, `i` ends up being treated as an `assignment expression`, so it should follow the rule for `assignment expression` and be an rvalue, I think. –  Oct 12 '19 at 09:32
  • @reavenisadesk - Except it quite clearly, black on white, isn't an rvalue. Or are you saying `i = 1` is invalid too? – StoryTeller - Unslander Monica Oct 12 '19 at 09:34
  • I think `i = 1` is valid because of this: `assignment-expression: unary-expression assignment-operator assignment-expression`. In `i = 1`, `i` is treated as a `unary-expression`, which is an lvalue, not as an `assignment expression`, so I think your `i = 1` example is a different case. –  Oct 12 '19 at 09:56
  • @reavenisadesk - Even if your assessment is correct, and I'm not sure so forgive me that. One still has to look *inside* the parentheses to determine type and value category, that's the rule above. The parenthesized expression does not inherit the value category from the production as an assignment expression. – StoryTeller - Unslander Monica Oct 12 '19 at 10:05
  • @reavenisadesk: The grammar does not say what things are lvalues or rvalues. The semantics do. For an assignment expression, C 6.5.16 3 (just under the heading “Semantics”) says an assignment expression (which is a phrase describing the expression formed with an assignment operator, not an *assignment-expression*, which is a token in the grammar) is not an lvalue. It does not say the *assignment-expression* grammar token is not an lvalue. – Eric Postpischil Oct 12 '19 at 10:32
  • @StoryTeller "The parenthesized expression does not inherit the value category from the production as an assignment expression": is there anywhere in the standard that proves this? –  Oct 12 '19 at 11:23
  • @reavenisadesk - Eric explained it far more eloquently than I can. Furthermore, seems to me you are picking on `(i)` unjustly here. The same point about semantics applies to *every* sort of expression. – StoryTeller - Unslander Monica Oct 12 '19 at 11:25
  • As for the reference, here you go http://port70.net/~nsz/c/c11/n1570.html#5.1.2.3p4 – StoryTeller - Unslander Monica Oct 12 '19 at 11:27
  • @EricPostpischil I just can't understand this; could you please explain more? –  Oct 12 '19 at 11:32
  • @reavenisadesk: You are correct that, to parse `(i) = 1`, `i` must cascade through grammar productions: *identifier* is a *primary-expression*, *primary-expression* is a *postfix-expression*, and so on until a *conditional-expression* is an *assignment-expression*, and *assignment-expression* is an *expression*, which then allows the *expression* `i` to satisfy the production `(` *expression* `)` for a *primary-expression* to form `(i)`. However, the fact that a production makes a *conditional-expression* an *assignment-expression* does not mean it is not an lvalue… – Eric Postpischil Oct 12 '19 at 11:37
  • … This is because nothing in the C standard says that this or any other grammatical production rule changes an lvalue into a value. The rule that says an assignment expression is not an lvalue is 6.5.16 3. But it is talking about an “assignment expression”, not about an “*assignment-expression*”. By “assignment expression”, it means tokens of the form *unary-expression* *assignment-operator* *assignment-expression*. It is saying that, if you do an actual assignment, as with `x = 3` or `y += 4`, the result of that is not an lvalue… – Eric Postpischil Oct 12 '19 at 11:40
  • … That rule applies only to actual assignments, which are what the rule means by “assignment expressions”. That rule does not apply to the “pass-through” production that converts a *conditional-expression* to an *assignment-expression*. – Eric Postpischil Oct 12 '19 at 11:41
  • @EricPostpischil Correct me if I'm wrong: `assignment expression` only means operations like `x = 3` or `y += 4`, where a store to memory actually happens because of code of the form `unary-expression assignment-operator assignment-expression`. An “assignment-expression” formed from only a `conditional-expression` (via the first production of assignment-expression; in this case, the single `i`) does not count as an `assignment expression`, right? –  Oct 12 '19 at 12:09
  • @reavenisadesk: Yes, the plain text “assignment expression” is used in the C standard to refer to places where we have actual assignment operators like `=` or `+=`. It does not refer to the grammatical entity *assignment-expression*, and a *conditional-expression* that is an *assignment-expression* is not part of the rule that an assignment expression is not an lvalue. – Eric Postpischil Oct 12 '19 at 12:11
  • Hi @EricPostpischil, does the standard emphasize this anywhere? Or is it an English convention (English is not my first language)? Or is it common knowledge, so the standard does not point it out specially? I read the standard carefully for a really long time but never figured out that `assignment expression` refers only to the second production of `assignment-expression`. –  Oct 12 '19 at 12:18
  • @reavenisadesk: It is a combination of (a) how English is used, (b) if we mean *assignment-expression*, we write *assignment-expression*, not assignment expression, (c) the other subclauses in clause 6.5 discuss what results their operators produce, so we understand the assignment subclause is also talking about the assignment operators—expressions cascade through the other grammar productions unchanged, and the *assignment-expression* production is no different. – Eric Postpischil Oct 12 '19 at 12:28
  • @EricPostpischil, really thanks, you should create an answer. –  Oct 12 '19 at 12:45

StoryTeller already explained where the standard says that, in your example, the expression (i) is still an lvalue, but I believe you are getting hung up on the spec for no reason, so allow me to try to address your concerns.

I checked the code and the rules, and found this: in assignment expression semantics:

An assignment operator shall have a modifiable lvalue as its left operand.

An assignment expression has the value of the left operand after the assignment, but is not an lvalue.

The entire quote is referring to the assignment expression as a whole, not the lhs or rhs.

"An assignment operator shall have a modifiable lvalue as its left operand." states that the lhs must be a modifiable lvalue.

"An assignment expression has the value of the left operand after the assignment, but is not an lvalue." states that the whole assignment expression itself as a result has the value of the lhs and is itself an rvalue.

So the following are all true:

int i;
  i <- modifiable lvalue

(i) = 1;
  (i) <- modifiable lvalue (per StoryTeller's answer)
  1 <- rvalue
  ((i) = 1) <- rvalue

Why is this significant? Consider the following:

int i = 0, j = 0, k = 0;
i = j = k = 1;
// parsed as `i = (j = (k = 1))`
// the expression `k = 1` has the value `1` and is an rvalue
// the expression `j = (k = 1)` has the value `1` and is an rvalue

(i = 2) = 3;
// is invalid: the expression `i = 2` is an rvalue, so it may not be the lhs of the assignment

In my case, there are two assignment expressions: (i) = 1 and the i in parentheses. So (i) should be an rvalue.

No, that is incorrect. (i) = 1 is the only assignment expression. It has two subexpressions: the parenthesized identifier (i) and the numeric constant 1.

Jeff Mercado

This answer is inspired by @Eric Postpischil.

The production for an assignment-expression is:

<assignment-expression> ::= <conditional-expression>
                          | <unary-expression> <assignment-operator> <assignment-expression>

In the standard, the phrase "assignment expression" specifically means an expression with an assignment operator. So:

<conditional-expression> is not an assignment expression
<unary-expression> <assignment-operator> <assignment-expression> is an assignment expression

So the rule:

An assignment expression has the value of the left operand after the assignment, but is not an lvalue.

applies only to the production <unary-expression> <assignment-operator> <assignment-expression>, not to <conditional-expression>.

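In the example (i) = 1, i is an <assignment-expression> but not an assignment expression. It gets there through the pass-through <conditional-expression> production, which does not change its value category, so i is still an lvalue and therefore (i) is an lvalue too.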
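For a compiler, this suggests deciding lvalue-ness from the kind of expression node, not from which grammar nonterminal the node passed through. Below is a minimal sketch of that idea. The node kinds, `struct node`, `is_lvalue` and `assignment_is_valid` are all hypothetical names invented for this illustration, and the check ignores the "modifiable" part of the constraint as well as types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical AST node kinds; a real compiler has many more. */
enum node_kind {
    NODE_IDENT,     /* i            (assumed to name an object) */
    NODE_CONSTANT,  /* 1            */
    NODE_PAREN,     /* ( operand )  */
    NODE_ASSIGN     /* lhs = rhs    */
};

struct node {
    enum node_kind kind;
    const struct node *lhs;  /* operand of NODE_PAREN, left operand of NODE_ASSIGN */
    const struct node *rhs;  /* right operand of NODE_ASSIGN, otherwise NULL */
};

/* Value category depends on the node kind only. */
bool is_lvalue(const struct node *n)
{
    switch (n->kind) {
    case NODE_IDENT:    return true;               /* 6.5.1p2, object identifier */
    case NODE_PAREN:    return is_lvalue(n->lhs);  /* 6.5.1p5, look inside */
    case NODE_ASSIGN:   return false;              /* 6.5.16p3 */
    case NODE_CONSTANT: return false;
    }
    return false;
}

/* Simplified form of the constraint in 6.5.16p2. */
bool assignment_is_valid(const struct node *assign)
{
    return assign->kind == NODE_ASSIGN && is_lvalue(assign->lhs);
}
```

With nodes built by hand for `(i) = 1`, `assignment_is_valid` returns true, while for `(i = 2) = 3` it returns false, because the parenthesized operand is itself an assignment. Note that the pass-through productions never appear as nodes here, so they cannot affect the result.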