T.C. left an interesting comment on my answer to this question:

Why aren't include guards in c++ the default?

T.C. states:

There's "header" and there's "source file". "header"s don't need to be actual files.

What does this mean?

Perusing the standard, I see plenty of references to both "header files" and "headers". However, regarding #include, I noticed that the standard seems to make reference to "headers" and "source files". (C++11, § 16.2)

A preprocessing directive of the form
    # include <h-char-sequence> new-line
searches a sequence of implementation-defined places for a header identified uniquely
by the specified sequence between the < and > delimiters, and causes the replacement
of that directive by the entire contents of the header. How the places are specified
or the header identified is implementation-defined.

and

A preprocessing directive of the form
    # include "q-char-sequence" new-line
causes the replacement of that directive by the entire contents of the source *file*
identified by the specified sequence between the " delimiters. The named source *file*
is searched for in an implementation-defined manner.

I don't know if this is significant. It could be that "headers" in a C++ context unambiguously means "header files", whereas "sources" on its own would be ambiguous, so "headers" can serve as shorthand but "sources" cannot. Or it could be that a C++ compiler is allowed some leeway for angle-bracket includes and only needs to act as if textual replacement takes place.

So when are header (files) not files?

The footnote mentioned by T.C. in the comments below is quite direct:

174) A header is not necessarily a source file, nor are the sequences delimited by < and > in header names necessarily valid source file names (16.2).

Praxeolitic
  • The only two instances of "header file" I can find are both in Annex C, which (a) is informative, not normative and (b) is clearly using it in the "normal" and not standardese sense. – T.C. Dec 03 '14 at 00:27
  • The standard headers are part of the implementation, and found in an implementation-defined manner. They could, in theory, be part of the compiler (and not files in the filesystem), though I'm not aware of any implementation which takes advantage of that licence. It's more practical to have them as physical, readable files, which a programmer can study easily. – Deduplicator Dec 03 '14 at 00:28
  • You're right. I overstated the prevalence of "header file". – Praxeolitic Dec 03 '14 at 00:29
  • Anyway, *header* is defined in §17.6.1.2 [headers]/p1, and there's an accompanying footnote that says "A header is not necessarily a source file, nor are the sequences delimited by `<` and `>` in header names necessarily valid source file names (16.2)." – T.C. Dec 03 '14 at 00:36

3 Answers

For the standard header "files", the C++ standard doesn't actually mandate that the compiler use a file, or that the file, if it uses one, look like a C++ file. Instead, the standard headers are specified to make a certain set of declarations and definitions available to the C++ program.

An alternative implementation to a file could be a readily packaged set of declarations, represented in the compiler as a data structure, which is made available when the corresponding #include directive is used. I'm not aware of any compiler which does exactly that, but Clang has started to implement a module system which makes the headers available from an already processed format.
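
As a rough illustration of that idea, here is a toy model (not how any real compiler works) in which the "headers" are tables built into the translator rather than files. The class `ToyTranslator`, its member functions, and the contents of the built-in table are all hypothetical names invented for this sketch. The point it tries to show is that the declarations only become visible once the matching include is processed.

```cpp
#include <map>
#include <set>
#include <string>

// Toy model only: a "translator" whose standard headers are packaged
// declaration sets held in memory, not files on disk.
class ToyTranslator {
public:
    // Models `#include <header>`: makes the packaged declarations visible.
    // Returns false if no built-in header has that name.
    bool include(const std::string& header) {
        const auto& table = builtin_headers();
        auto it = table.find(header);
        if (it == table.end()) return false;
        visible_.insert(it->second.begin(), it->second.end());
        return true;
    }

    // A name is "known" only after its header has been included.
    bool is_declared(const std::string& name) const {
        return visible_.count(name) != 0;
    }

private:
    // Hypothetical built-in table; a real implementation would hold far
    // richer data than a set of names.
    static const std::map<std::string, std::set<std::string>>& builtin_headers() {
        static const std::map<std::string, std::set<std::string>> table = {
            {"cstdio", {"std::printf", "std::puts"}},
            {"cstdlib", {"std::abort", "std::exit"}},
        };
        return table;
    }

    std::set<std::string> visible_;
};
```

With this sketch, `is_declared("std::puts")` is false on a fresh `ToyTranslator` and becomes true only after `include("cstdio")`, while names from `cstdlib` stay unknown.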

Dietmar Kühl
  • Do these possibilities actually get discussed among people involved with the standard or do you just think up these diabolical ideas on the fly when it comes up on SO? – Praxeolitic Dec 03 '14 at 00:35
  • @Praxeolitic: The linked page references the C++ committee's module-proposal. – Deduplicator Dec 03 '14 at 00:42
  • @Praxeolitic: such things have been discussed in comp.std.c and comp.std.c++ for decades (and Dietmar probably participated in some of those discussions, though Google's current search capability for Usenet makes it difficult to verify that). – Jerry Coffin Dec 03 '14 at 00:43
  • @Praxeolitic: there was a version of IBM's C++ compiler which used a Smalltalk-like representation of the program consisting of a database of declarations (I'm not sure if that was ever released to people outside IBM, though). There was also an initial version of an open source compiler called [TenDRA](http://en.wikipedia.org/wiki/TenDRA_Compiler), which received its declarations for the standard library in a format that was harder to write but faster to parse. ... and, of course, [modules](http://clang.llvm.org/docs/Modules.html) are being discussed for standardization. – Dietmar Kühl Dec 03 '14 at 00:46

They do not have to be files. Since the C and C++ preprocessors are nearly identical, it is reasonable to look at the C99 rationale for some clarity on this. The Rationale for International Standard—Programming Languages—C says in section 7.1.2 Standard headers (emphasis mine):

In many implementations the names of headers are the names of files in special directories. This implementation technique is not required, however: the Standard makes no assumptions about the form that a file name may take on any system. Headers may thus have a special status if an implementation so chooses. Standard headers may even be built into a translator, provided that their contents do not become “known” until after they are explicitly included. One purpose of permitting these header “files” to be “built in” to the translator is to allow an implementation of the C language as an interpreter in a free-standing environment where the only “file” support may be a network interface.

Shafik Yaghmour

It really depends on the definition of files.

If you consider any database which maps filenames to contents to be a filesystem, then yes, headers are files. If you only consider files to be whatever the OS kernel's open system call recognizes, then no, headers don't have to be files.

They could be stored in a relational database. Or a compressed archive. Or downloaded over the network. Or stored in alternate streams or embedded resources of the compiler executable itself.

In the end, though, textual replacement is done, and the text comes from some sort of indexed-by-name database.
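
To make the "indexed-by-name database" point concrete, here is a minimal sketch of textual replacement where the header text comes from an in-memory map and never touches a filesystem. The names `HeaderStore` and `expand_includes` are made up for this example, and the directive matching is deliberately naive (it only recognizes a line that is exactly `#include <name>`), so treat it as an illustration and not as a real preprocessor.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical "header database": header name -> header text.
// The same lookup could be backed by an archive, a network fetch, etc.
using HeaderStore = std::map<std::string, std::string>;

// Replaces each line of the exact form `#include <name>` with the stored
// text for `name`, recursively. Unknown names are left untouched.
// `depth` is a crude guard against headers that include each other forever.
std::string expand_includes(const std::string& source,
                            const HeaderStore& store,
                            int depth = 0) {
    const std::string prefix = "#include <";
    std::istringstream in(source);
    std::string out;
    std::string line;
    while (std::getline(in, line)) {
        const bool is_include =
            depth < 16 &&
            line.compare(0, prefix.size(), prefix) == 0 &&
            line.back() == '>';
        if (is_include) {
            const std::string name =
                line.substr(prefix.size(), line.size() - prefix.size() - 1);
            const auto it = store.find(name);
            if (it != store.end()) {
                // Textual replacement: splice in the header's contents.
                out += expand_includes(it->second, store, depth + 1);
                continue;
            }
        }
        out += line;
        out += '\n';
    }
    return out;
}
```

Nothing in `expand_includes` cares where the `HeaderStore` entries came from, which is the sense in which the header only has to be identified by name.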

Dietmar mentioned modules and loading already processed content... but this is generally NOT allowable behavior for #include according to the C++ standard (modules will have to use a different syntax, or perhaps #include with a completely new quotation scheme other than <> or ""). The only processing that could be done in advance is tokenization. But contents of headers and included source files are subject to stateful preprocessing.
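
A small example of why the preprocessing is stateful: the standard specifies that `assert` is redefined according to the current state of `NDEBUG` each time `<cassert>` is included, so the same header yields different results at different points in one translation unit. The function names below are invented for this sketch.

```cpp
#undef NDEBUG
#include <cassert>   // NDEBUG is not defined here: assert is active

int evaluated_with_checks() {
    int calls = 0;
    assert(++calls == 1);   // the expression is evaluated
    return calls;
}

#define NDEBUG
#include <cassert>   // same header, different macro state: assert is a no-op

int evaluated_without_checks() {
    int calls = 0;
    assert(++calls == 1);   // expands to ((void)0); ++calls never runs
    return calls;
}

// Restore the active form of assert for any code that follows.
#undef NDEBUG
#include <cassert>
```

Here `evaluated_with_checks()` returns 1 and `evaluated_without_checks()` returns 0, even though both bodies are textually identical. A cached, fully processed copy of the header could not reproduce that without knowing the macro state at each include.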

Some compilers implement "precompiled headers" which have done more processing than mere tokenization, but eventually you find some behavior that violates the Standard. For example, in Visual C++:

The compiler ... skips to just beyond the #include directive associated with the .h file, uses the code contained in the .pch file, and then compiles all code after filename.

Ignoring the actual source code prior to the #include definitely does not conform to the Standard. (That doesn't prevent it from being useful, but you need to be aware that edits may not produce the expected behavior changes.)

Ben Voigt