
TL;DR: See the last paragraph of this question.

I'm a Computer Science student trying to finish the text of my master's thesis, a case study about creating a transpiler.

For this thesis, part of my text compares the languages involved. One of those languages is C++.

I'm trying to explain the difference in import/include semantics between the languages, and the historical reasons why C++ does it the way it does. I know how it works in C/C++, so I don't need a technical explanation.

Researching extensively on Google and Stack Overflow, I came up with several Stack Overflow explanations and other references on this topic:

Why are forward declarations necessary?

What are forward declarations in C++?

Why does C++ need a separate header file?

http://en.wikipedia.org/wiki/Include_directive

http://www.cplusplus.com/forum/articles/10627/

https://softwareengineering.stackexchange.com/questions/180904/are-header-files-actually-good

http://en.wikipedia.org/wiki/One-pass_compiler

Why have header files and .cpp files in C++?

And last but not least, Bjarne Stroustrup's book "The Design and Evolution of C++" (1994), pages 34-35.

If I understand correctly, this way of doing imports/includes was inherited from C, and it came about for the following reasons:

  • Computers were not as fast, so a one-pass compiler was preferable. The only way to make that work was to enforce the declare-before-use idiom. This is because C and C++ have context-sensitive grammars: the right symbols need to be in the symbol table in order to disambiguate some of the parse rules (see the sketch just after this list). This is in contrast to modern compilers: nowadays a first pass usually constructs the symbol table, and sometimes (when the language has a context-free grammar) the symbol table isn't needed during parsing at all, because there are no ambiguities to resolve.

  • Memory was very limited and expensive in those days, so it was not feasible to keep a whole program's symbol table in memory on most computers. That's why C let programmers forward-declare just the function prototypes and global variables they actually needed. Headers were created so that developers could keep those declarations centralized and easily reuse them across the modules that needed the same symbols.

  • Header files were a useful way to separate interface from implementation.

  • C++ tried to establish backwards compatibility with software and software libraries written in C. More importantly, C++ originally transpiled to C (Cfront) and then used a C compiler to turn that code into machine code. This also made it possible to target many different platforms right from the start, as each of those platforms already had a C compiler and a C linker.
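To make the first bullet concrete, here is a minimal C++ sketch of my own (the names `T`, `a` and `b` are purely illustrative, not taken from any of the references above):

```cpp
// The statement "T * y;" cannot even be parsed without knowing what T is:
// if T names a type, the statement declares y as a pointer to T; if T
// names a value, it multiplies T by y. The symbol table drives the parse.

struct T {};          // T names a type here...

void f() {
    T * y = nullptr;  // ...so this declares y as a pointer to T
    (void)y;          // the cast to void just silences unused-variable warnings
}

int a = 2, b = 3;     // a and b name variables here...

void g() {
    (void)(a * b);    // ...so this is an expression multiplying a by b
}
```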

The above is an illustration of what I discovered by searching first ;) The problem is: I can't find a suitable reference for the historical reasons behind this include strategy, aside from here on Stack Overflow, and I highly doubt my university will be happy with a Stack Overflow link. The closest I've come is the "Design and Evolution of C++" reference, but it doesn't mention the hardware limitations as a reason for the include strategy. I think that's to be expected, because the design of the feature came from C. The problem is that I haven't yet found a good source describing this design decision in C, preferably one with the hardware limitations in mind.

Can anyone point me in the right direction?

Thanks!

Moonsurfer_1
    Your first link explains it clearly. As for finding a reference, I'm not sure you could find a proper reference for every decision made in the early 1970s... but good luck :) – P.P Jun 09 '14 at 10:49
  • I know, that's where most of my explanation came from ;) But I doubt that my university will accept that reference. That's why I'm asking. But thanks anyway for answering ;) I really hope I can still find one. – Moonsurfer_1 Jun 09 '14 at 10:51

1 Answer


You're right that the reason C++ does it this way is that C did it this way. The reason C did it this way is also historical; in the very beginning (B), there were no declarations. If you wrote `f()`, then the compiler assumed that `f` was a function somewhere. Which returned a word, since everything in B was a word; there were no types. When C was invented (to add types, since "everything is a word" isn't very efficient on byte-addressed machines), the basic principle didn't change, except that the function was assumed to return `int` (and to take arguments of the types you gave it). If it didn't return `int`, then you had to forward declare it with the return type.

In the earlier days of C, it wasn't rare to see applications which didn't use `#include` at all, and which simply redeclared e.g. `char* malloc()` in each source file that used `malloc`. The preprocessor was developed to avoid having to retype the same thing multiple times, and at the very beginning, its most important feature was probably `#define`. (In early C, all of the functions in `<ctype.h>`, and the character-based IO in `<stdio.h>`, were macros.)
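A minimal sketch of that pre-header style, with hypothetical names (`add`, `util.cpp`, `main.cpp`); this much still compiles today, although the implicit-int shortcut no longer does:

```cpp
// util.cpp -- the definition lives in this translation unit
int add(int a, int b) { return a + b; }
```

```cpp
// main.cpp -- no #include anywhere: the declaration is simply retyped by
// hand, the way early C programs redeclared char* malloc() in each file
int add(int a, int b);

int main() { return add(1, 2); }
```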

As for why the declaration needed to precede the use: the main reason is doubtless that, if it didn't, the compiler would assume an implicit declaration (a function returning `int`, etc.). And at the time, compilers were generally one-pass, at least for the parsing; it was considered too complicated to go back and "correct" an assumption that had already been made.
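A minimal sketch of the rule, with a hypothetical `helper` function:

```cpp
void helper();                // forward declaration: the name is now known

void caller() { helper(); }   // OK, the declaration precedes the use;
                              // without the forward declaration, C++ rejects
                              // the call, and pre-C99 C would instead assume
                              // an implicit "int helper()"

void helper() {}              // the definition itself may come later
```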

Of course, C++ isn't constrained as much by this; C++ has always required functions to be declared, for example, and in certain contexts (in-class member function definitions, for example) it doesn't require the declaration to precede the use. (Generally, however, I consider in-class member function definitions a misfeature, to be avoided for readability reasons. The fact that function definitions must be inside the class in Java is a major reason not to use that language in large projects.)
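A minimal sketch of that exception, with a hypothetical class:

```cpp
struct Widget {
    // Inside a class this is fine: member function bodies are compiled
    // as if they appeared after the complete class definition, so they
    // may use members declared further down.
    int doubled() { return value * 2; }

    int value = 21;  // declared after its use above
};

int main() {
    Widget w;
    return w.doubled();  // 42
}
```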

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • 1
    I believe the OP already knows the *why* bit. OP's just asking for an authoritative reference for it (e.g. a Dennis Ritchie paper or similar). – P.P Jun 09 '14 at 11:06
  • Yes, Blue moon is right, this answer is not really what I'm looking for, but it's interesting nonetheless. I knew that C assumes an `int` return type if no return type is specified, yet I didn't know that the C compiler would not complain about undeclared functions. That seems very odd to me. Thanks for your answer, I'll look into this further ;) It might be something I could mention in my text as well. – Moonsurfer_1 Jun 09 '14 at 11:41
  • 1
    @Moonsurfer_1 It will (and always would) complain about undeclared variables, although even there, "implicit int" ruled. (E.g. you could write `static a;`, and it would be interpreted as `static int a;`.) And I'll admit that there's not much you could cite in a university paper here; it's more personal reminiscence, triggered mostly by the claims that it was due to slow machines or whatever. It was just more or less the standard way of doing things back then (and ruled even in languages like Pascal, which didn't have any implicit declarations). – James Kanze Jun 09 '14 at 12:30