3

When I type the code below

int x = 1;
+++x;

it is parsed as ++(+x), which is of course ill-formed because the operand of ++ is an rvalue.

I am curious why it cannot be parsed as +(++x), which would be valid code.

Does this depend on the IDE or the compiler?

Is it specified in the C++ Standard, or is it just undefined behaviour?

Thanks a lot for answering this question, and please forgive my poor English.

user4581301
Bella Wang
  • 2
    Check [operator precedence rules](https://en.cppreference.com/w/cpp/language/operator_precedence). – πάντα ῥεῖ Apr 19 '22 at 16:52
  • 7
    It's known as the "maximum munch" rule. Since `++` is a valid token, the compiler eats the first two `+`s and treats them as the `++` operator. It doesn't go back after it detects a problem to try and see if there was something different it could have done. – Pete Becker Apr 19 '22 at 16:55
  • 1
    Do not tag both C and C++ except for questions about differences or interactions between the two languages. – Eric Postpischil Apr 19 '22 at 16:58

2 Answers

7

From C++20 (draft N4860) [lex.pptoken]/3.3

— Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, ...

and [lex.pptoken]/6

[Example: The program fragment x+++++y is parsed as x ++ ++ + y, which, if x and y have integral types, violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression. —end example]

So it is a rule of the language: the first two + characters are grouped into a single ++ token first, which leaves the remaining + attached to the variable.
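To make the effect of that rule concrete, here is a minimal sketch (the function names are made up for illustration). Whitespace or parentheses are enough to steer the lexer towards the tokens you actually want:

```cpp
// Hypothetical helper names, chosen only for this illustration.

// The space makes the lexer produce '+' followed by '++',
// so this is unary plus applied to the result of pre-increment.
int plus_preinc(int x) {
    return + ++x;
}

// The spaced-out form of the standard's x+++++y example:
// post-increment of x, binary plus, pre-increment of y.
int post_plus_pre(int x, int y) {
    return x++ + ++y;
}

// Without the space the first two '+' become one '++' token,
// leaving ++(+x), which is rejected because +x is not an lvalue:
// int broken(int x) { return +++x; }
```

With x = 1, plus_preinc yields 2, which is the value the asker expected from +(++x).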

Funnily, this reminds me of an old problem: before C++11, std::vector<std::vector<int>> a failed to compile because >> was lexed as one token instead of two (since a token is supposed to be the longest possible sequence of characters). This is addressed by [temp.names]/3:

When a name is considered to be a template-name, and it is followed by a <, the < is always taken as the delimiter of a template-argument-list and never as the less-than operator. When parsing a template-argument-list, the first non-nested > is taken as the ending delimiter rather than a greater-than operator. Similarly, the first non-nested >> is treated as two consecutive but distinct > tokens, the first of which is taken as the end of the template-argument-list and completes the template-id. [Note: The second > token produced by this replacement rule may terminate an enclosing template-id construct or it may be part of a different construct (e.g., a cast). —end note]
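A small sketch of what that paragraph means in practice (the function names are hypothetical). Note the word "non-nested": a real right shift inside a template argument has to be wrapped in parentheses, otherwise its >> would be taken as closing delimiters.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only.
std::size_t nested_rows() {
    // Since C++11 the >> here is split into two closing '>' tokens.
    std::vector<std::vector<int>> a(3, std::vector<int>(2));
    // The pre-C++11 spelling with a space is still valid.
    std::vector<std::vector<int> > b(3, std::vector<int>(2));
    return a.size() + b.size();
}

// Hypothetical helper, for illustration only.
std::size_t shifted_size() {
    // The parentheses make the >> nested, so it stays a shift operator.
    std::array<int, (8 >> 1)> c{};
    return c.size();
}
```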

Evg
ChrisMM
2

This is a consequence of the maximum munch tokenization principle:

A C++ implementation must collect as many consecutive characters as possible into a token.

From lex.pptoken#3.3:

Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, except that a header-name is only formed within a #include directive.

And since ++ is the longest valid token at that point, it is formed first, so the expression is treated as ++ +x, that is, ++(+x).
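As a rough illustration of the principle, here is a toy tokenizer that only knows identifiers, ++ and single-character tokens. It is a sketch under those assumptions, not how any real compiler is implemented, and the name munch is made up:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy "maximal munch" lexer (hypothetical): at each position it takes
// the longest token it knows about and never backtracks.
std::vector<std::string> munch(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (std::isalnum(c) || c == '_') {
            // Identifier or number: consume as many characters as possible.
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_')) {
                ++j;
            }
            tokens.push_back(src.substr(i, j - i));
            i = j;
        } else if (src.compare(i, 2, "++") == 0) {
            // Prefer the two-character token over a single '+'.
            tokens.push_back("++");
            i += 2;
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

Feeding it "+++x" gives the tokens ++, + and x, while "+ ++x" gives +, ++ and x, which matches the behaviour described in the question.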

Jason