
Suppose my program has the strings s1 = "like/*this", s2 = "like /*this is a comment */this" and s3 = "like //this is not a comment". In s1 and s3, "/*" and "//" are part of the string. In s2, the comment-like text is meant for the users and should be displayed on the output screen. What algorithm does a C/C++ compiler use to handle this? (My guess is that the compiler just ignores all text inside "".)

2 Answers


No. Inside string literals there are no comments; all the characters are part of the string. From the C standard, section 6.4.9 (Comments):

Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it.

A similar rule then follows for // comments.

Also, there is a nice footnote clarifying that, since /* is not recognized inside a comment, comments do not nest.
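Applied to the three strings from the question, the rule quoted above means that every character between the quotes is kept as data. The following is a minimal sketch to make that concrete; `check_literals` is a hypothetical helper name introduced only for this illustration, and the expected lengths are simply the character counts of the literals.

```c
#include <assert.h>
#include <string.h>

// The three strings from the question; everything between the quotes is data.
static const char s1[] = "like/*this";
static const char s2[] = "like /*this is a comment */this";
static const char s3[] = "like //this is not a comment";

// Hypothetical helper: checks that no characters were dropped as comments.
void check_literals(void)
{
    // The comment-like sequences count as ordinary characters.
    assert(strlen(s1) == 10);
    assert(strlen(s2) == 31);
    assert(strlen(s3) == 28);

    // They can be found again inside the string contents.
    assert(strstr(s2, "/*this is a comment */") != NULL);
    assert(strstr(s3, "//") != NULL);
}
```

If the compiler treated any of these sequences as comments, the literals would be shorter (or would not compile at all), and the assertions would fail.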

About the algorithm used by compilers... well, when tokenizing the input file, the compiler knows whether it is inside a string or not (it must know its own state), so it is easy to decide whether to switch to comment mode or not.
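The state tracking described above can be sketched as a small scanner that removes comments while leaving string and character literals untouched. This is only an illustration under simplifying assumptions, not how any particular compiler is written: the names `scan_state` and `strip_comments` are hypothetical, and the sketch ignores trigraphs, backslash-newline line splicing, and C++ raw string literals.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical, simplified model of the state a tokenizer keeps.
enum scan_state { CODE, STRING, CHAR_LIT, LINE_COMMENT, BLOCK_COMMENT };

// Copies src to dst with comments removed. Each block comment is replaced
// by a single space. dst must have room for strlen(src) + 1 bytes.
// Returns the number of characters written, not counting the terminator.
size_t strip_comments(const char *src, char *dst)
{
    assert(src != NULL && dst != NULL);
    enum scan_state st = CODE;
    size_t n = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        char c = src[i];
        char next = src[i + 1];   // safe: src[i] is not the terminator

        switch (st) {
        case CODE:
            if (c == '/' && next == '*') {
                st = BLOCK_COMMENT;
                dst[n++] = ' ';
                i++;              // skip the '*'
            } else if (c == '/' && next == '/') {
                st = LINE_COMMENT;
                i++;              // skip the second '/'
            } else {
                if (c == '"')  st = STRING;
                if (c == '\'') st = CHAR_LIT;
                dst[n++] = c;
            }
            break;
        case STRING:
        case CHAR_LIT:
            dst[n++] = c;
            if (c == '\\' && next != '\0') {
                dst[n++] = next;  // an escaped character never ends the literal
                i++;
            } else if ((st == STRING && c == '"') ||
                       (st == CHAR_LIT && c == '\'')) {
                st = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') {      // the newline itself is kept
                st = CODE;
                dst[n++] = c;
            }
            break;
        case BLOCK_COMMENT:
            if (c == '*' && next == '/') {
                st = CODE;
                i++;              // skip the closing '/'
            }
            break;
        }
    }
    dst[n] = '\0';
    return n;
}
```

Because the comment openers are only tested in the `CODE` state, the same two characters inside a string or character literal are copied through unchanged, and a second opener inside a block comment is ignored, which is why comments do not nest.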

rodrigo

This is handled by the lexical analysis phase of the compiler. For C, it is tied to the preprocessing (so look into the libcpp/ directory of the GCC source code). Read more about parsing & abstract syntax trees.

You should read the Dragon Book, which gives an overview of compilation techniques (we cannot explain them here in a few sentences).

Lexical analysis is often done using finite automata techniques (finite automata correspond to regular expressions). In many cases, you can generate the lexical analyzer, e.g. using flex. The syntax analyzer can also be generated, e.g. using bison or ANTLR (this is related to stack automata).
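To illustrate the finite automaton idea, here is a minimal hand-written sketch of a table-driven automaton that accepts exactly one /* ... */ comment. It is not output from flex or code from any compiler; the state names, the `transition` table, and `is_block_comment` are all hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

// Input characters are grouped into three classes.
enum char_class { CC_SLASH, CC_STAR, CC_OTHER, CC_COUNT };

// States of the automaton for one block comment.
enum dfa_state {
    S_START,      // nothing read yet
    S_SLASH,      // read the opening '/'
    S_BODY,       // inside the comment
    S_BODY_STAR,  // inside the comment, last character was '*'
    S_DONE,       // read the closing '/'
    S_DEAD,       // input cannot be a single block comment
    S_COUNT
};

// transition[state][class] gives the next state.
static const enum dfa_state transition[S_COUNT][CC_COUNT] = {
    //                 slash    star         other
    [S_START]     = { S_SLASH, S_DEAD,      S_DEAD },
    [S_SLASH]     = { S_DEAD,  S_BODY,      S_DEAD },
    [S_BODY]      = { S_BODY,  S_BODY_STAR, S_BODY },
    [S_BODY_STAR] = { S_DONE,  S_BODY_STAR, S_BODY },
    [S_DONE]      = { S_DEAD,  S_DEAD,      S_DEAD },
    [S_DEAD]      = { S_DEAD,  S_DEAD,      S_DEAD },
};

static enum char_class classify(char c)
{
    if (c == '/') return CC_SLASH;
    if (c == '*') return CC_STAR;
    return CC_OTHER;
}

// Returns 1 if text is exactly one block comment, 0 otherwise.
int is_block_comment(const char *text)
{
    assert(text != NULL);
    enum dfa_state state = S_START;
    for (size_t i = 0; text[i] != '\0'; i++)
        state = transition[state][classify(text[i])];
    return state == S_DONE;
}
```

The loop does one table lookup per input character and keeps no other memory, which is what makes this a finite automaton. An opener inside the body just keeps the automaton in `S_BODY`, so the first closing pair ends the comment.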

(BTW, the current GCC 6 & 7 use hand-written lexers and parsers instead of generating them with e.g. flex & bison: first, to manage a lot of extra information, such as source positions and how something has been macro-expanded; also, for better error messages; and perhaps for efficiency.)

If you want detailed explanations about GCC, my MELT documentation web page contains a lot of references. Look also at the GCC internals documentation, and of course download and study the source code of GCC. See also this.

Basile Starynkevitch