I want to know the number of tokens in the statement given below:
a+++b---c
Please tell me the number of tokens. I told my viva teacher that there are 7 tokens, but he said that is wrong.
You are correct. In C, there are seven tokens:
a
++
+
b
--
-
c
According to the C standard (pre-C11 draft): http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf
6.4 Lexical elements
3 A token is the minimal lexical element of the language in translation phases 7 and 8. The categories of tokens are: keywords, identifiers, constants, string literals, and punctuators. ...
4 If the input stream has been parsed into preprocessing tokens up to a given character, the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token. ...
6 EXAMPLE 2 The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.
6.4.6 Punctuators
...
punctuator: one of
... ++ -- ... + -
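The longest-match rule in paragraph 4 can be sketched as a toy lexer. This is only an illustration, not how any real compiler is written: the names next_token_len and count_tokens are hypothetical, and the sketch knows only identifiers and the punctuators +, ++, +=, -, -- and -= (real C has many more, such as ->).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: length of the token that starts at p.
   Handles identifiers and the punctuators + ++ += - -- -= only. */
static size_t next_token_len(const char *p)
{
    if (isalpha((unsigned char)*p) || *p == '_') {
        size_t n = 1;
        while (isalnum((unsigned char)p[n]) || p[n] == '_')
            n++;
        return n;
    }
    if (*p == '+' || *p == '-') {
        /* Longest match: a doubled sign, or a sign followed by '=',
           forms a single two-character token. */
        if (p[1] == *p || p[1] == '=')
            return 2;
        return 1;
    }
    return 1; /* anything else is treated as a one-character token */
}

/* Hypothetical helper: counts tokens in src, skipping whitespace. */
size_t count_tokens(const char *src)
{
    size_t count = 0;
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            src++;
            continue;
        }
        src += next_token_len(src);
        count++;
    }
    return count;
}
```

Under these assumptions, count_tokens("a+++b---c") returns 7, and count_tokens("x+++++y") returns 5, matching the x ++ ++ + y split from EXAMPLE 2.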
So the maximal munch rule (https://en.wikipedia.org/wiki/Maximal_munch) is used, as was noted in https://stackoverflow.com/a/7485174 and in a comment about the a+++b fragment (What does the operation c=a+++b mean?, Shahbaz, Sep 20 2011): "the lexer of C and C++, try to match the biggest string they can when they see something. ... Therefore, when the lexer sees the first plus, it tries the next character, it sees that it can match both characters as a ++, then continues on to see the next +. Hence, the parser sees a ++ + b"
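The effect of this tokenization can also be observed at run time. The following is a minimal sketch; the function munch_demo and its out-parameters are hypothetical names introduced only for this illustration.

```c
/* Hypothetical helper: evaluates a+++b---c and reports the values
   of a and b afterwards through the two out-parameters. */
int munch_demo(int a, int b, int c, int *a_after, int *b_after)
{
    /* Lexed as a ++ + b -- - c, so it means (a++) + (b--) - c. */
    int r = a+++b---c;
    *a_after = a;
    *b_after = b;
    return r;
}
```

With a = 1, b = 2, c = 3 the result is 1 + 2 - 3 = 0, and afterwards a is 2 and b is 1. If the expression were instead read as a + ++b - --c, the result would be 2 and a would be unchanged, so the observed values are consistent with the a ++ + b -- - c split.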
While gcc and clang have complex code and may mix different translation phases of the standard within a single piece of code (so they are not the best guides to the language, as rici said), we can check how they lex ++ and --. When the lexer sees a + character, it generates different tokens depending on the next character: if that is also +, it emits a plusplus token; if it is =, it emits a plusequal token; otherwise it emits a plus token:
http://code.metager.de/source/xref/llvm/clang/lib/Lex/Lexer.cpp#3264
    case '+':
      Char = getCharAndSize(CurPtr, SizeTmp);
      if (Char == '+') {
        CurPtr = ConsumeChar(CurPtr, SizeTmp, Result);
        Kind = tok::plusplus;
      } else if (Char == '=') {
        CurPtr = ConsumeChar(CurPtr, SizeTmp, Result);
        Kind = tok::plusequal;
      } else {
        Kind = tok::plus;
      }
      break;
http://code.metager.de/source/xref/gnu/gcc/libcpp/lex.c#2633
    case '+':
      result->type = CPP_PLUS;
      if (*buffer->cur == '+')
        buffer->cur++, result->type = CPP_PLUS_PLUS;
      else if (*buffer->cur == '=')
        buffer->cur++, result->type = CPP_PLUS_EQ;
      break;
So the tokens of the expression a+++b---c are a, ++, +, b, --, -, c. Your advisor may have said you were wrong only because he wanted you to explain why you counted 7. If the question was exactly the task given here, and the statement is parsed according to the C standard (or C++, which lexes this example the same way), you can explain your answer and show him the relevant parts of the language standards.