
I was convinced that a translation unit is a .cpp file (or, to avoid referring to an extension, a file you would feed to `g++ -c theTranslationUnit.cpp -o whatever.o`) after the macros have been expanded, the #includes have been copied and pasted in (recursively), and the comments have been removed.

In other words, I was thinking of it as "take a C++ file and process all the #s and delete all the comments in it".

However, I've recently found this very clear answer about the steps that GCC performs, and I experimented with that information, finding that the typical output of g++ -E someSource.cpp looks like this:

# 0 "main.cpp"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "<command-line>" 2
# 1 "main.cpp"
# 1 "Foo.hpp" 1
struct Foo {
};
# 2 "main.cpp" 2
int main() {
}

which I can fairly easily understand, but…

  • is it valid C++ code? Clearly I can feed it to g++, but I believe that's just because g++ recognizes it and processes it accordingly, e.g. by skipping the preprocessing step.
  • Is it the thing known as translation unit? As in, with all those #-lines?
Enlico
  • This may help - https://stackoverflow.com/questions/15679756/g-e-option-output – Richard Critten Sep 18 '22 at 15:31
  • http://eel.is/c++draft/lex.phases#footnote-9 http://eel.is/c++draft/lex.phases#note-3 – Language Lawyer Sep 18 '22 at 15:33
  • The output of the preprocessor is implementation-specific. There is no guarantee that it can be compiled. Also, the -E option is primarily a test tool, e.g. GCC's -E option preserves some preprocessor directives in source – Swift - Friday Pie Sep 18 '22 at 15:41
  • Translation unit: https://stackoverflow.com/questions/1106149/what-is-a-translation-unit-in-c – Swift - Friday Pie Sep 18 '22 at 15:42
  • @Swift-FridayPie, I consulted that link right before asking. I should have probably linked it myself from the question. But since none made mention of `-E` or equivalent options, I wanted to ask. – Enlico Sep 18 '22 at 15:44
  • @enlico: A translation unit is basically produced at translation phase 7. At that point, the TU is a list of tokens, without whitespace, which is a different data structure than a character string. ("The resulting tokens are syntactically and semantically analyzed and translated as a translation unit." 5.2 [lex.phases], para 1.7.) In fact, as that same clause notes, "...translation units... need not necessarily be stored as files, nor need there be any one-to-one correspondence between these entities and any external representation." `g++ -E` is, therefore, not covered by the standard. – rici Sep 25 '22 at 03:43

1 Answer

is it valid C++ code?

It's not "strictly conforming" C++ code.

There are only so many preprocessor directives (see https://eel.is/c++draft/gram.cpp ). `# <number> "source" <number>` falls into the `# conditionally-supported-directive` case, for which https://eel.is/c++draft/cpp.pre#2 says:

A conditionally-supported-directive is conditionally-supported with implementation-defined semantics.

It happens to be conforming C++ code, in that it may be accepted by a conforming C++ compiler that supports these semantics.
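To make the shape of those `# <number> "source" <number>` lines concrete, here is a minimal sketch of a parser for them. It assumes the layout described in GCC's preprocessor documentation (`# linenum "filename" flags`, where flag 1 marks entering a file, 2 returning to a file, 3 a system header, and 4 an implicit `extern "C"` wrapper). The names `LineMarker` and `parse_linemarker` are made up for this illustration, and file names containing escaped quotes are not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One GCC-style linemarker: # <line> "<file>" <flags...>
// (hypothetical type, for illustration only)
struct LineMarker {
    int line = 0;
    std::string file;
    std::vector<int> flags;  // e.g. 1 = enter file, 2 = return to file
};

// Fills `out` and returns true if `text` looks like a linemarker;
// returns false (leaving `out` untouched) otherwise.
bool parse_linemarker(const std::string& text, LineMarker& out) {
    std::istringstream in(text);
    char hash = 0, quote = 0;
    LineMarker m;
    if (!(in >> hash) || hash != '#') return false;      // must start with '#'
    if (!(in >> m.line)) return false;                   // then a line number
    if (!(in >> quote) || quote != '"') return false;    // then an opening quote
    // Read the file name up to the closing quote; reject unterminated names.
    if (!std::getline(in, m.file, '"') || in.eof()) return false;
    for (int flag = 0; in >> flag;) m.flags.push_back(flag);  // optional flags
    out = m;
    return true;
}
```

Applied to the output in the question, `# 1 "Foo.hpp" 1` would parse as line 1 of Foo.hpp with the "entering a file" flag, while `struct Foo {` would be rejected.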

Is it the thing known as translation unit? As in, with all those #-lines?

Yes, it is a translation unit. It's a segment of text that is an input to the compiler (translator).


It's really not relevant. C++ places no requirements on the output of gcc -E; gcc can do whatever it wants here and output whatever it wants. Whether this is valid C++ code or not is not relevant from the C++ standard's point of view, nor from gcc's. This is internal gcc output for gcc's use. You may be interested in How to remove lines added by default by the C preprocessor to the top of the output? .

KamilCuk
  • _Yes, it is a translation unit. It's a segment of text that is an input to the compiler (translator)._ Have you tried to check the definition of translation unit? It is a bit different. Lol. – Language Lawyer Sep 21 '22 at 12:30