The #line preprocessor directive described in the Stack Overflow question you link to is a standard C directive for setting the compiler’s notion of the current source file and line number. It could be used for conveying this information through preprocessing so that, after preprocessing, the compiler still has information about the origins of lines of code. It may also be used by other tools that process or produce source code, such as YACC or Lex, to provide information about where code found in their output originated in their input files.
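As a small illustration of the standard directive, the sketch below resets the compiler's idea of the current position and then reads it back through the predefined macros __LINE__ and __FILE__. The file name "generated.y", the number 100, and the two function names are made up for the example.

```c
/* Everything after this directive is reported as coming from the
   (hypothetical) file "generated.y", starting at line 100. */
#line 100 "generated.y"
static int reported_line(void) { return __LINE__; }          /* this is "line 100" */
static const char *reported_file(void) { return __FILE__; }  /* this is "line 101" */
```

Calling reported_line() yields 100 and reported_file() yields "generated.y", regardless of where the code actually sits in the real source file. Diagnostics for code after the directive are attributed the same way, which is how a generator can point errors back at its own input.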
However, GCC uses its own non-standard mechanism to convey this and additional information. In the preprocessor output you show, the non-standard directive # 1 "main.c" is essentially equivalent to the standard directive #line 1 "main.c"; both say that the following line came from line 1 of the file “main.c”.
Thus, the origin line information is completely visible in the preprocessor output you show.
However, the GCC form allows additional information. In these lines:
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
the trailing “1 3 4” means this is the start of a new file (1), it comes from a system header so certain warnings should be suppressed (3), and it should be treated as wrapped in an extern "C" block (4). The trailing “2” means it is returning to a prior file after having included another file. (Apparently the inclusion of “/usr/include/stdc-predef.h” resulted in no lines of code, possibly because the file was completely wrapped with a #if … #endif pair that was not activated.)
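To make the flag meanings concrete, here is a rough sketch of how a tool might decode one of these GCC linemarkers. It is not code from GCC; the struct linemarker type and the parse_linemarker function are hypothetical names invented for this example, and the parser assumes the file name contains no escaped quotes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the pieces of one GCC linemarker. */
struct linemarker {
    unsigned long line;  /* line number the next output line maps to */
    char file[256];      /* file name without the surrounding quotes */
    int new_file;        /* flag 1: start of a new file */
    int returning;       /* flag 2: returning to a file after an include */
    int system_header;   /* flag 3: text comes from a system header */
    int extern_c;        /* flag 4: treat as wrapped in extern "C" */
};

/* Parses text such as:  # 1 "/usr/include/stdc-predef.h" 1 3 4
   Returns 1 on success and 0 if the text is not a linemarker.
   Simplification: escaped quotes inside the file name are not handled. */
static int parse_linemarker(const char *text, struct linemarker *out)
{
    int consumed = 0;
    int flag, n;
    const char *p;

    memset(out, 0, sizeof *out);
    /* %n is only stored if the closing quote was matched. */
    if (sscanf(text, "# %lu \"%255[^\"]\"%n",
               &out->line, out->file, &consumed) != 2 || consumed == 0)
        return 0;

    /* Zero or more numeric flags follow the file name. */
    p = text + consumed;
    while (sscanf(p, " %d%n", &flag, &n) == 1) {
        switch (flag) {
        case 1: out->new_file = 1; break;
        case 2: out->returning = 1; break;
        case 3: out->system_header = 1; break;
        case 4: out->extern_c = 1; break;
        default: break;  /* unknown flags are ignored in this sketch */
        }
        p += n;
    }
    return 1;
}
```

Given the first line above, this sets new_file, system_header, and extern_c; given the second, it sets only returning. A standard #line directive is rejected, because a number must directly follow the # in the GCC form.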
… when its preprocessor has removed comments?
When the GCC preprocessor removes comments, it leaves the new-line characters in, so the line spacing remains unchanged. For example, in processing the input:
abc
/* Multiple-line comment
consisting of
three lines */
xyz
the preprocessor produces:
abc



xyz
So the output has the same number of lines as the input, and line numbers remain correct after preprocessing. However, column information is not conveyed in this way. Consider this code:
int foo/*comment*/(nuts);
When I compile it with Clang 11.0.0, the error message is:
x.c:1:20: error: a parameter list without types is only allowed in a function
definition
int foo/*comment*/(nuts);
                   ^
As we can see, the compiler knows the error begins in column 20. However, when I preprocess it with clang -E x.c >x.i and then compile the resulting x.i file, the error message is:
x.c:1:10: error: a parameter list without types is only allowed in a function
definition
int foo (nuts);
         ^
This demonstrates that the column information is not contained in the preprocessor output. Therefore, we can conclude the compiler maintains this information internally when it is doing both the preprocessing and the compilation. In modern GCC and Clang, preprocessing is integrated into the compilation; it is not actually a separate processing step.
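To illustrate why line numbers survive in the preprocessed text while columns do not, here is a minimal sketch of a comment stripper that behaves like the output shown above: each block comment becomes a single space, and every new-line inside the comment is kept. This is not GCC's or Clang's implementation. The function name strip_block_comments is hypothetical, and the sketch ignores string literals, character constants, and // comments.

```c
/* Copies src to dst, replacing each block comment with one space while
   keeping every new-line found inside the comment.
   dst must have room for at least strlen(src) + 1 bytes; the output is
   never longer than the input.
   Simplification: string literals, character constants, and // comments
   are not recognized. */
static void strip_block_comments(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            src += 2;
            *dst++ = ' ';              /* the whole comment becomes one space */
            while (*src != '\0' && !(src[0] == '*' && src[1] == '/')) {
                if (*src == '\n')
                    *dst++ = '\n';     /* keep the line structure intact */
                src++;
            }
            if (*src != '\0')
                src += 2;              /* skip the closing delimiter */
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Applied to the five-line abc … xyz input, the result still has five lines, so xyz stays on line 5. Applied to int foo/*comment*/(nuts); the result is int foo (nuts); in which nuts has moved from column 20 to column 10, matching the two diagnostics above.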
Another way to see that preprocessing is integrated into compilation is to compile this code:
int foo(nuts);
#error "Stop processing."
If preprocessing were a separate step prior to compilation, the #error directive would cause a message to be printed and would cause the process to exit. However, when this is compiled with Clang, the compiler first prints a message about the int foo(nuts) line and then prints the message for the #error line. This shows the preprocessing is intertwined with the compilation; the preprocessing is being done line-by-line in concert with compilation, so the compiler does not reach the #error directive until it has already processed the prior int foo(nuts); line.