4

How can I understand the parsing of expressions like

a = b+++++b---c--;

in C?

I just made up the expression above, and yes, I can check the results using any compiler, but what I want to know is the ground rule that I should know to understand the parsing of such expressions in C.

Moeb
  • 1
    @Martin: if you can get 6.2/4 removed from the C standard as "who cares", then you can close this question "who cares" ;-p Daft examples can illustrate fundamentals. – Steve Jessop Oct 24 '10 at 23:03
  • 1
    Possible duplicate of [Why doesn't a+++++b work in C?](https://stackoverflow.com/questions/5341202/why-doesnt-ab-work-in-c) – phuclv Aug 13 '17 at 11:31

3 Answers

5

From the C standard (C99 6.4, paragraph 4):

If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token.

They even add the example:

EXAMPLE 2 The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.
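The "longest sequence" rule quoted above can be sketched as a toy tokenizer. This is only an illustration, not how any real compiler is written: the names `match_operator` and `tokenize` are made up here, and the operator table covers just the handful of tokens that appear in the question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest operator starting at s,
   or 0 if none. Only the operators used in the question are listed. */
static size_t match_operator(const char *s)
{
    /* Longer spellings come first so that "++" wins over "+". */
    static const char *const ops[] = { "++", "--", "+", "-", "=", ";" };
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        size_t n = strlen(ops[i]);
        if (strncmp(s, ops[i], n) == 0)
            return n;
    }
    return 0;
}

/* Split src into tokens by always taking the longest match, and write
   them to out separated by single spaces. Returns the number of tokens,
   or -1 if a character is not recognised or out is too small. */
int tokenize(const char *src, char *out, size_t outsize)
{
    assert(src != NULL && out != NULL && outsize > 0);
    size_t used = 0;
    int count = 0;
    out[0] = '\0';
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;              /* whitespace only separates tokens */
            continue;
        }
        size_t len = 0;
        if (isalpha((unsigned char)*src) || *src == '_') {
            /* An identifier extends as far as it can. */
            while (isalnum((unsigned char)src[len]) || src[len] == '_')
                len++;
        } else {
            len = match_operator(src);
        }
        if (len == 0)
            return -1;
        size_t need = len + (count > 0 ? 1 : 0);
        if (used + need + 1 > outsize)
            return -1;
        if (count > 0)
            out[used++] = ' ';
        memcpy(out + used, src, len);
        used += len;
        out[used] = '\0';
        src += len;
        count++;
    }
    return count;
}
```

Calling `tokenize("x+++++y", buf, sizeof buf)` on a large enough `buf` leaves `x ++ ++ + y` in it. The tokenizer never looks ahead to see whether a shorter token would make the expression valid.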

So your statement:

a = b+++++b---c--; 

is tokenized as:

a = b ++ ++ + b -- - c -- ;
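For contrast, here is a small sketch of how whitespace changes the outcome. The function names are made up for the illustration. Because the tokens are fixed before any grammar is applied, the unspaced form is rejected, while the spaced form from the standard's example compiles.

```c
#include <assert.h>

/* Hypothetical helper: evaluates x++ + ++y on its own copies of x and y. */
int spaced_sum(int x, int y)
{
    /* int bad = x+++++y;   rejected: the tokens are x ++ ++ + y, and the
       result of x++ is not an lvalue, so the second ++ has nothing to modify. */
    return x++ + ++y;   /* tokens: x ++ + ++ y */
}

/* Small self-check: x++ yields the old x (1), ++y yields the new y (3). */
void check_spaced_sum(void)
{
    assert(spaced_sum(1, 2) == 4);
}
```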
Dingo
  • 3
    That lexes it. Depending what the questioner means by "parse", you also need to use the operator precedence rules to construct an expression tree. – Steve Jessop Oct 23 '10 at 17:16
  • 2.1.1.2 Translation phases 7. [...] Preprocessing tokens are converted into tokens. [...] http://flash-gordon.me.uk/ansi.c.txt – starblue Oct 23 '10 at 17:25
  • 2
    Given the lack of spaces, I assume he means parsing into tokens. There isn't a valid expression tree for his example, anyway. – Dingo Oct 23 '10 at 19:19
  • "I assume he means parsing into tokens". Yes, that is what I meant. – Moeb Oct 28 '10 at 16:06
2

The operators involved are ++, --, + and -. Some parentheses and spaces will help here:

a = ((b++)++) + (b--) - (c--);

I don't know how parsing works exactly, but there's no ambiguity involved (OK, there is, see Dingo's answer), so I guess it could be done with some simple rules like:

  • One or more characters make a variable name, the simplest kind of "expression"
  • Operators + and - combine two "expressions"
  • Operators ++ and -- are a suffix to an "expression"

To remove the ambiguity, you can give ++ and -- a higher priority than + and -.
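The rules above can be sketched as a tiny recursive-descent parser that prints the grouping. This is a rough illustration under my own assumptions, not a real C parser: `parse_postfix`, `group_expression` and `BUF` are made-up names, only identifiers and the four operators are handled, and + and - are treated as left-associative. It shows the grouping only. A compiler would still reject `(b++)++`, because `b++` is not an lvalue.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum { BUF = 256 };   /* size of every text buffer used below */

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* postfix := identifier { "++" | "--" }
   Writes the grouped text to out and returns the new input position,
   or NULL on a syntax error. */
static const char *parse_postfix(const char *p, char out[BUF])
{
    p = skip_ws(p);
    size_t n = 0;
    while (isalnum((unsigned char)p[n]) || p[n] == '_')
        n++;
    if (n == 0 || isdigit((unsigned char)p[0]) || n >= BUF)
        return NULL;
    memcpy(out, p, n);
    out[n] = '\0';
    p = skip_ws(p + n);
    /* Two identical signs in a row are always taken as one "++" or "--",
       so the suffix operators bind before any binary + or - is seen. */
    while ((p[0] == '+' || p[0] == '-') && p[1] == p[0]) {
        char tmp[BUF];
        if (snprintf(tmp, BUF, "(%s%c%c)", out, p[0], p[0]) >= BUF)
            return NULL;
        strcpy(out, tmp);
        p = skip_ws(p + 2);
    }
    return p;
}

/* expr := postfix { ("+" | "-") postfix }, grouped left to right.
   Returns 0 on success and -1 on a syntax error or overflow. */
int group_expression(const char *src, char out[BUF])
{
    assert(src != NULL && out != NULL);
    const char *p = parse_postfix(src, out);
    while (p != NULL && (*p == '+' || *p == '-')) {
        char op = *p, rhs[BUF], tmp[BUF];
        p = parse_postfix(p + 1, rhs);
        if (p == NULL)
            break;
        if (snprintf(tmp, BUF, "(%s %c %s)", out, op, rhs) >= BUF)
            return -1;
        strcpy(out, tmp);
    }
    return (p != NULL && *p == '\0') ? 0 : -1;
}
```

For the right-hand side of the question, `group_expression("b+++++b---c--", out)` fills `out` with `((((b++)++) + (b--)) - (c--))`, which is the same grouping as the hand-written version above with the left-to-right association of + and - made explicit.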

schnaader
1

I don't know how familiar you are with parsers, so just in case: http://en.wikipedia.org/wiki/LL_parser

If you need a formal grammar description, take a look at the grammar files written for a parser generator: https://javacc.dev.java.net/servlets/ProjectDocumentList?folderID=110

ika