7

I need to find the number of tokens in the following C statement:

printf("i = %d, &i = %x", i, &i);

I think there are 12 tokens here. But my answer is wrong.

Can anybody tell me how to find the tokens in the above C statement?

PS: I know that a token is source-program text that the compiler does not break down into component elements.

Gilles 'SO- stop being evil'
Suraj Menon
  • What are the 12 tokens you see there? – Mat Oct 13 '12 at 13:55
  • I'd say there are 10. `printf` `(` `"i = %d, &i = %x"` `,` `i` `,` `&` `i` `)` `;` – user703016 Oct 13 '12 at 13:57
  • Well, I count ten tokens. To a degree, it depends on how much detail one preserves and how much one ignores. (Could it be that you consider the spaces tokens?) While the C standard requires a certain interpretation for the preprocessor, that doesn't have to influence the rest of the parser. –  Oct 13 '12 at 13:57
  • [addresses must be printed by `%p`, not `%x`](https://stackoverflow.com/q/30354097/995714) – phuclv Oct 28 '17 at 16:37
  • I'm so lazy, I wrote a lexer to tell me the answer to this is 10. – MisterGeeky Aug 14 '18 at 06:51

4 Answers

10

As far as I understand C code parsing, the tokens are (10 in total):

printf
(
"i = %d, &i = %x"
,
i
,
&
i
)
;

I don't count white space: it's generally meaningless and only serves as a separator between other tokens. I also don't break the string literal down into pieces, because it's a single entity of its own.
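To make the counting rule concrete, here is a minimal sketch of a token counter in C. It is not a conforming C lexer: the function name `count_tokens` is hypothetical, every punctuator is assumed to be a single character (so `==` or `->` would be counted as two tokens), and comments and preprocessing directives are not handled. It only illustrates the rule used above: skip white space, and treat a whole string literal as one token.

```c
#include <ctype.h>
#include <stddef.h>

/* Count tokens in a fragment of C source (simplified sketch).
 * Recognises identifiers/keywords, numbers, string and character
 * literals; any other non-space character is assumed to be a
 * one-character punctuator. */
size_t count_tokens(const char *s)
{
    size_t count = 0;

    while (*s != '\0') {
        unsigned char c = (unsigned char)*s;

        if (isspace(c)) {               /* white space only separates tokens */
            s++;
            continue;
        }

        if (isalpha(c) || c == '_') {   /* identifier or keyword */
            while (isalnum((unsigned char)*s) || *s == '_')
                s++;
        } else if (isdigit(c)) {        /* number (very rough approximation) */
            while (isalnum((unsigned char)*s) || *s == '.')
                s++;
        } else if (c == '"' || c == '\'') {
            char quote = *s++;          /* whole literal is one token */
            while (*s != '\0' && *s != quote) {
                if (*s == '\\' && s[1] != '\0')
                    s++;                /* skip the escaped character */
                s++;
            }
            if (*s == quote)
                s++;                    /* consume the closing quote */
        } else {
            s++;                        /* single-character punctuator */
        }
        count++;
    }
    return count;
}
```

Called on the statement from the question, written as the C string `"printf(\"i = %d, &i = %x\", i, &i);"`, this sketch returns 10, matching the list above.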

Alexey Frunze
  • then what will be the lexeme of this statement? – Rajesh M Apr 05 '13 at 12:47
  • @rafanadal What are you talking about? – Alexey Frunze Apr 05 '13 at 12:48
  • kk simply what are lexemes – Rajesh M Apr 06 '13 at 10:48
  • @rafanadal Looks like it's the same thing here. At least, if you don't need to distinguish between the two different stars (unary and binary), the two different pluses and minuses (unary and binary), the two different commas (operator vs separator), the two different ampersands (unary and binary), the two different assignment operators (assignment vs initialization), the differently used parens/braces, etc. – Alexey Frunze Apr 06 '13 at 10:57
4

This looks very much like a school assignment, but the answer depends on whether or not whitespace counts: 10 if it doesn't, 12 if it does (or 13, if whitespace counts and there is a trailing newline).

'printf' '(' '"i = %d, &i = %x"' ',' 'i' ',' '&' 'i' ')' ';'
  1       2     3                4   5   6   7   8   9  10
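The 12 and 13 figures come from adding the whitespace characters that sit outside the string literal to the 10 tokens. As a rough sketch, the hypothetical helper below counts those characters. It assumes the input contains no character literals or comments, and it counts each whitespace character individually rather than collapsing runs.

```c
#include <ctype.h>
#include <stddef.h>

/* Count whitespace characters that lie outside string literals
 * (simplified: character literals and comments are not handled). */
size_t count_outer_whitespace(const char *s)
{
    size_t count = 0;
    int in_string = 0;

    for (; *s != '\0'; s++) {
        if (in_string) {
            if (*s == '\\' && s[1] != '\0')
                s++;                /* skip the escaped character */
            else if (*s == '"')
                in_string = 0;      /* closing quote */
        } else if (*s == '"') {
            in_string = 1;          /* opening quote */
        } else if (isspace((unsigned char)*s)) {
            count++;
        }
    }
    return count;
}
```

For the statement in the question this returns 2 (the spaces after the two commas outside the string), giving 10 + 2 = 12; with a trailing newline it returns 3, giving 13.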
perh
3

Yes, there are 10 tokens in total, because the characters enclosed in quotes are treated as a single token by the lexical analyser (LA). That is a property of the LA.

Ramya
1

Comments are NOT counted as tokens. White space, new-line characters, and tabs are also NOT counted as tokens. So there are definitely 10 tokens.