
Looking at the C grammar, it seems that the input ++i has two possible derivations: it can be parsed either as the prefix increment operator applied to i, or as two applications of unary plus (each performing integer promotion), as in +(+i). The same goes for --i.
What am I missing?

unary-expression:
   postfix-expression
   ++ unary-expression
   -- unary-expression
   unary-operator cast-expression
   sizeof unary-expression
   sizeof ( type-name )

unary-operator: one of
    & * + - ~ !

cast-expression:
    unary-expression
    ( type-name ) cast-expression
Shafik Yaghmour
z̫͋
  • @JonathanLeffler According to the comments by Robert in this [meta question](http://meta.stackoverflow.com/questions/266364/why-was-my-question-marked-duplicate-citing-an-existing-similar-answer), these are not duplicates unless it is an exact dup or the dup is a canonical question/answer. I found that a somewhat surprising position, but the meta questions on dups do have somewhat conflicting answers. I have attempted to get a clarification but have had none so far. – Shafik Yaghmour Jul 27 '14 at 03:33
  • The basic issue is the maximal munch rule in both cases; they are really about the same issue. Maybe I should have used [What is the name of this operator: "-->"?](http://stackoverflow.com/questions/1642028/what-is-the-name-of-this-operator) instead? (Mostly joking; that's quite a bit different.) But there are numerous other possible duplicates for this, all boiling down to the maximal munch rule. – Jonathan Leffler Jul 27 '14 at 03:42
  • @JonathanLeffler Well, his argument is that the question itself has to be a dup; just having it boil down to the same answer does not make it a dup. Considering he is a moderator, I have to take the position seriously. – Shafik Yaghmour Jul 27 '14 at 03:51
  • Seems like too rigid a position though. – Shafik Yaghmour Jul 27 '14 at 03:57

2 Answers


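The lexer uses the maximal munch principle: it takes as many characters as it can to form a valid token. The characters ++ are therefore always read as the single token ++ and never as two + tokens, so the grammar never sees the second derivation and this kind of ambiguity does not arise.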

We can confirm this in the draft C99 standard, section 6.4 Lexical elements, paragraph 4, which says:

If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token. [...]

and it provides two examples:

EXAMPLE 1 The program fragment 1Ex is parsed as a preprocessing number token (one that is not a valid floating or integer constant token), even though a parse as the pair of preprocessing tokens 1 and Ex might produce a valid expression (for example, if Ex were a macro defined as +1). Similarly, the program fragment 1E1 is parsed as a preprocessing number (one that is a valid floating constant token), whether or not E is a macro name.

and

EXAMPLE 2 The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.
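The rule quoted above can be sketched as a toy tokenizer. This is only an illustration under stated assumptions, not how any real compiler is written: the punctuator table holds just ++, --, + and -, and the names punctuators, longest_token and tokenize are hypothetical helpers introduced for this example.

```c
#include <ctype.h>
#include <string.h>

/* Toy punctuator table (assumption: real C has many more punctuators). */
const char *const punctuators[] = { "++", "--", "+", "-" };

/* Length of the longest token starting at s, or 0 if nothing matches. */
size_t longest_token(const char *s)
{
    size_t best = 0;

    if (isalnum((unsigned char)*s) || *s == '_') {
        /* Identifier or number: consume the whole alphanumeric run. */
        while (isalnum((unsigned char)s[best]) || s[best] == '_')
            best++;
        return best;
    }
    /* Punctuator: keep the longest table entry that matches here. */
    for (size_t k = 0; k < sizeof punctuators / sizeof punctuators[0]; k++) {
        size_t len = strlen(punctuators[k]);
        if (strncmp(s, punctuators[k], len) == 0 && len > best)
            best = len;
    }
    return best;
}

/* Writes the tokens of src into out, separated by single spaces.
   Returns 0 on success, -1 on an unknown character or a full buffer. */
int tokenize(const char *src, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;                  /* whitespace only separates tokens */
            continue;
        }
        size_t len = longest_token(src);
        if (len == 0)
            return -1;              /* character outside the toy grammar */
        if (used + len + 2 > outsize)
            return -1;              /* no room for separator, token, NUL */
        if (used > 0)
            out[used++] = ' ';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';
        src += len;
    }
    return 0;
}
```

With this sketch, tokenize("x+++++y", buf, sizeof buf) fills buf with x ++ ++ + y, matching EXAMPLE 2, and "++i" becomes ++ i, while "+ +i" becomes + + i because the space ends the first token.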

Shafik Yaghmour

According to the C Standard (6.4 Lexical elements, paragraph 4):

If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token.

So there is no ambiguity: the characters ++ always form a single ++ token.

For example in this program

#include <stdio.h>

int main( void ) 
{

    int a = 1;
    int b = 10;
    int c = a+++b;

    printf( "c = %d\n", c ); 
}   

The output will be

c = 11

because the expression

a+++b

will be interpreted as

a++ + b

not as

a + ++b
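The same rule settles the ++i case from the question. As a minimal sketch (the function names prefix_increment and double_unary_plus are hypothetical, chosen only for this illustration), the two readings differ only by the whitespace that separates the tokens:

```c
/* One "++" token: increments *p and yields the new value. */
int prefix_increment(int *p)
{
    return ++*p;
}

/* Two "+" tokens, kept apart by the space: unary plus applied twice.
   Yields the value of *p and leaves it unchanged. */
int double_unary_plus(int *p)
{
    return + +*p;
}
```

Starting from i equal to 5, double_unary_plus(&i) returns 5 and leaves i at 5, whereas prefix_increment(&i) returns 6 and leaves i at 6. To get the unary-plus reading you have to write + +i or +(+i) explicitly; ++i without a separator is always the increment.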
Vlad from Moscow