Why is +++x tokenized as ++(+x) instead of +(++x) in C++?

Question

When I type the code below

int x = 1;
+++x;

it is parsed as ++(+x), which is of course ill-formed because the operand of ++ is an rvalue.

I am curious why it cannot be parsed as +(++x), which would be valid code.

Does this depend on the IDE or the compiler?

Can this be found in the C++ Standard, or is it just undefined behaviour?

Thanks a lot for answering this question, and please forgive my poor English.

Check [operator precedence rules](https://en.cppreference.com/w/cpp/language/operator_precedence). — πάντα ῥεῖ, Apr 19 '22 at 16:52
It's known as the "maximum munch" rule. Since `++` is a valid token, the compiler eats the first two `+`s and treats them as the `++` operator. It doesn't go back after it detects a problem to try and see if there was something different it could have done. — Pete Becker, Apr 19 '22 at 16:55
Do not tag both C and C++ except for questions about differences or interactions between the two languages. — Eric Postpischil, Apr 19 '22 at 16:58

score 7 · Answer 1 · edited Apr 19 '22 at 17:20

From C++20 (draft N4860) [lex.pptoken]/3.3

— Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, ...

and [lex.pptoken]/6

[Example: The program fragment x+++++y is parsed as x ++ ++ + y, which, if x and y have integral types, violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression. —end example]

So it is a rule of the language: the first two + characters are grouped into a ++ token first, which leaves the remaining + to go with the variable.
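As a minimal sketch of what this means in practice (the function names below are made up for illustration, they are not from the question), separating the characters with whitespace changes how they are tokenized:

```cpp
// Hypothetical helper: the whitespace splits the characters into + and ++.
int plus_of_preincrement(int x) {
    // +++x;        // tokenized as ++ + x: ill-formed, ++ needs a modifiable lvalue
    return + ++x;   // tokens: + ++ x, i.e. unary plus applied to the result of ++x
}

// Hypothetical helper: the same rule on the postfix side.
int postincrement_plus(int x, int y) {
    return x+++y;   // tokens: x ++ + y, i.e. (x++) + y
}
```

With x equal to 1, plus_of_preincrement returns 2, and postincrement_plus(1, 2) returns 3 because x++ yields the old value of x.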

Funnily, this reminds me of an old problem: std::vector<std::vector<int>> a used to fail to compile because >> was lexed as one token instead of two (since a token is the longest possible sequence of characters). Since C++11 this is addressed by [temp.names]/3:

When a name is considered to be a template-name, and it is followed by a <, the < is always taken as the delimiter of a template-argument-list and never as the less-than operator. When parsing a template-argument-list, the first non-nested > is taken as the ending delimiter rather than a greater-than operator. Similarly, the first non-nested >> is treated as two consecutive but distinct > tokens, the first of which is taken as the end of the template-argument-list and completes the template-id. [Note: The second > token produced by this replacement rule may terminate an enclosing template-id construct or it may be part of a different construct (e.g., a cast). —end note]
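A small sketch of the "non-nested" wording, assuming a C++11 or later compiler (the helper names are hypothetical): a >> that closes two template-argument-lists needs no space, while a genuine right shift inside a template-argument-list has to be nested in parentheses.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds a 3x2 nested vector and returns its element count.
std::size_t nested_element_count() {
    // The non-nested >> is treated as two closing > tokens.
    std::vector<std::vector<int>> a(3, std::vector<int>(2));
    return a.size() * a[0].size();
}

// Hypothetical helper: the parentheses make this >> nested, so it stays a shift.
std::size_t shifted_array_size() {
    std::array<int, (16 >> 2)> b{};
    return b.size();
}
```

Without the parentheses, the first > of the shift would end the template-argument-list of std::array and the declaration would not compile.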

Jason · Answer 2 · 2022-04-19T17:02:21.853

2

This is a consequence of the maximum munch tokenization principle:

A C++ implementation must collect as many consecutive characters as possible into a token.

From lex.pptoken#3.3:

Otherwise, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token, even if that would cause further lexical analysis to fail, except that a header-name is only formed within a #include directive.

Since ++ is the longest valid token that can be formed at the start of +++x, the lexer produces ++ first, and the expression is treated as if it were written ++ +x.
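To make the principle concrete, here is a toy tokenizer. It is only a sketch: it knows just ++, +, and identifier-like runs, and munch is a made-up name, not how any real compiler is implemented. At each position it tries the longest candidate first and never backtracks.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy maximal-munch lexer (illustration only, not the real preprocessing-token rules).
std::vector<std::string> munch(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {                      // whitespace only separates tokens
            ++i;
        } else if (src.compare(i, 2, "++") == 0) {  // try the longest operator first
            tokens.push_back("++");
            i += 2;
        } else if (src[i] == '+') {                 // fall back to the shorter one
            tokens.push_back("+");
            ++i;
        } else if (std::isalnum(c) || src[i] == '_') {
            std::size_t j = i;                      // consume the whole identifier/number run
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_')) {
                ++j;
            }
            tokens.push_back(src.substr(i, j - i));
            i = j;
        } else {                                    // any other character is its own token
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

Calling munch("+++x") yields the tokens ++, +, x, whereas munch("+ ++x") yields +, ++, x. The standard's own example x+++++y comes out as x, ++, ++, +, y.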

edited Apr 19 '22 at 17:02

answered Apr 19 '22 at 16:58



This answer quotes text without saying where it is from. If it is not authoritative text, it is of limited value. – Eric Postpischil Apr 19 '22 at 17:00
