How does main.cpp see this?

Question

Just started out in C++, so bit of a noob, and not sure why this works. How can main.cpp see to use the print() function contained in the separate print.cpp file? I thought you had to use #include/header files or something like that? I'm using Visual Studio if that helps.

main.cpp

```cpp
#include "stdafx.h"
#include <iostream>
#include <string>

void print(std::string message);

int main()
{
    std::cout << "Enter message: ";
    std::string message = "";
    std::getline(std::cin, message);
    print(message);
    return 0;
}
```

print.cpp

```cpp
#include "stdafx.h"
#include <iostream>
#include <string>

void print(std::string message)
{
    std::cout << "Your message is - " << message << std::endl;
}
```

You have the `void print(std::string message);` declaration in `main.cpp`, so calling the function does not cause a compiler error. You also build `print.cpp` and link it in, so the resulting binary contains the function body and calling the function does not cause a linker error either. — user7860670, Sep 23 '17 at 19:59
It's the `void print(std::string message);` bit that would be better placed in a `print.h` file (after an initial line saying `#pragma once`, as for almost all header files), then `main.cpp` should `#include "print.h"`. That way if some other .cpp file wants to use `print()`, it can include the same header. — Tony Delroy, Sep 23 '17 at 20:01
Informally (I'll let others write out the standardese), you're telling the compiler that there is an external symbol (the `void print(std::string message);` statement) defined in a separate [translation unit](https://stackoverflow.com/questions/1106149/what-is-a-translation-unit-in-c). During the linking process, the said symbol will have to be available in order for the linker to produce a valid executable. — Mihai Todor, Sep 23 '17 at 20:06
A header file is the accepted and recommended place to contain external declarations, and an `#include` directive is similarly the accepted way of bringing said declaration into scope for your translation unit. You don't technically *have* to use them. — n. m. could be an AI, Sep 23 '17 at 20:09

Ray Toal · Answer 1 · 2017-09-23T22:31:11.160

Actually the code in main.cpp does not "see" the print function in print.cpp at all!

The compiler checks the call to print only against the incomplete function specification (the declaration) you wrote earlier in the file, not against anything from any other file. C++ allows this incomplete specification as a way to say, "I'm not telling you how this function is implemented now, but it will be available once this file and all the other files have been compiled and are linked together, perhaps with some existing libraries."
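To make that concrete, here is a minimal single-file sketch. The names `format_message`, `greet` and `declaration_sketch_check` are made up for illustration (the function returns a string instead of printing so the result can be checked), and the definition sits at the bottom of the same file only so the example links on its own. In the question's layout it would live in print.cpp.

```cpp
#include <cassert>
#include <string>

// Declaration only: name, parameter types and return type, no body.
std::string format_message(std::string message);

// The compiler checks this call against the declaration above.
// At this point it has not seen any body for format_message.
std::string greet()
{
    return format_message("hello");
}

// The definition. In the question this part would be in print.cpp and the
// linker would connect the call to it.
std::string format_message(std::string message)
{
    return "Your message is - " + message;
}

// Small self-check; call it from main() to try the sketch.
void declaration_sketch_check()
{
    assert(greet() == "Your message is - hello");
}
```

If you delete the definition, the file still compiles; the failure only shows up as an unresolved symbol at link time.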

You mentioned include files. An #include directive simply pastes the contents of the named file, typically a bunch of incomplete function specifications (declarations), directly into your program. After inclusion (which happens in a preprocessing phase before the compiler proper runs), you have code that looks just like your main.cpp above. In fact, to the C++ compiler, your code looks no different from a version in which your incomplete function specification of print was replaced with an #include directive naming a file that contains that specification.
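As a rough picture of what the compiler sees after preprocessing, the sketch below pastes the text of a hypothetical print.h into the file by hand, twice, as if it had been included from two places. The header contents, the `PRINT_H` include guard and the function names are assumptions for illustration, not something from the question.

```cpp
#include <cassert>
#include <string>

// First pasted copy of the hypothetical print.h, standing in for what
// the preprocessor would substitute for `#include "print.h"`.
#ifndef PRINT_H
#define PRINT_H
std::string format_message(std::string message);
#endif

// Second pasted copy, as if another header had included print.h again.
// PRINT_H is already defined, so the guard skips the body this time.
#ifndef PRINT_H
#define PRINT_H
std::string format_message(std::string message);
#endif

// Code that uses the declaration, just as main.cpp would.
std::string describe_answer()
{
    return format_message("42");
}

// The definition, which would normally live in print.cpp.
std::string format_message(std::string message)
{
    return "Your message is - " + message;
}

// Small self-check; call it from main() to try the sketch.
void include_sketch_check()
{
    assert(describe_answer() == "Your message is - 42");
}
```

The guard (or `#pragma once`, as suggested in the comments) is what keeps a header's contents from being processed more than once in the same translation unit.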

An interesting thing about incomplete function specifications is that the functions implementing them can often be written in other languages, as long as their data types map directly to C++ types. In your case, std::string ties you to C++, but had you used int or even char*, the function could have been implemented in C or assembly language (in practice with the declaration marked extern "C", so that the symbol names match).
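A small sketch of that idea, with one addition the answer does not spell out: C++ compilers normally mangle function names, so a function implemented in C or assembly is usually declared with `extern "C"`. The names `add_ints` and `sum_of_three` are hypothetical, and the definition is written in C++ here only so the example links by itself.

```cpp
#include <cassert>

// C linkage: the symbol name is not mangled, and only C-compatible types
// (plain int) appear in the signature, so the body could come from an
// object file produced by a C compiler or an assembler.
extern "C" int add_ints(int a, int b);

// C++ code calling through the declaration above.
int sum_of_three(int a, int b, int c)
{
    return add_ints(add_ints(a, b), c);
}

// Stand-in definition so this sketch is self-contained. In a mixed-language
// build this would be replaced by the C or assembly implementation.
extern "C" int add_ints(int a, int b)
{
    return a + b;
}

// Small self-check; call it from main() to try the sketch.
void c_linkage_sketch_check()
{
    assert(sum_of_three(1, 2, 3) == 6);
}
```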

Minor nitpick about the last paragraph: That print function could not have been implemented in Fortran, C, or any other language, because those languages do not know what a C++ std::string is. Even if they did, it could be binary incompatible. — André, Sep 23 '17 at 20:17
Thank you! Quite correct. Edited according to your correction. — Ray Toal, Sep 23 '17 at 22:31

score 2 · Answer 2 · answered Sep 23 '17 at 20:31

The reason that you can compile code in separate translation units is linkage. Linkage is a property of a name, and there are three kinds of linkage, which determine what the name means when it is seen in different scopes:

None: the meaning of a name with no linkage is unique to the scope in which the name appears. For example, "normal" variables declared inside a function have no linkage, so the name i in foo() has a distinct meaning from the name i in bar().
Internal: the meaning of a name with internal linkage is the same inside each translation unit, but distinct across translation units. Typical examples are the names of variables declared at namespace scope that are constants, or that appear in an unnamed namespace, or that use the static specifier. For a concrete example, static int n = 10; declared in one .cpp file refers to the same entity in every use of that name inside that file, but a different static int n in a different file refers to a distinct entity.
External: the meaning of a name with external linkage is the same across the entire program. That is, wherever you declare a specific name with external linkage, that name refers to the same thing. This is the default linkage for functions and non-constants at namespace scope, but you can also explicitly request external linkage with the extern specifier. For example, extern int a; would refer to the same int object anywhere in the program.
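The three kinds can be put side by side in one file. This is only a sketch, and all the variable and function names below are invented for illustration. With a single translation unit you cannot observe the cross-file behaviour directly, so the comments state which linkage each name has.

```cpp
#include <cassert>

int shared_counter = 0;          // external linkage: non-const at namespace scope
static int file_local = 10;      // internal linkage: static specifier
namespace { int hidden = 20; }   // internal linkage: unnamed namespace
const int limit = 30;            // internal linkage: constant at namespace scope

// The two local variables named i have no linkage: each is private to
// its own function.
int foo() { int i = 1; return i; }
int bar() { int i = 2; return i; }

int bump_shared()
{
    // Redeclaration with extern: names the same object as the namespace-scope
    // shared_counter above, exactly as it would from another .cpp file.
    extern int shared_counter;
    return ++shared_counter;
}

// Uses the three names with internal linkage.
int internal_total()
{
    return file_local + hidden + limit;
}

// Small self-check of the parts that do not modify state.
void linkage_sketch_check()
{
    assert(foo() == 1);
    assert(bar() == 2);
    assert(internal_total() == 60);
}
```

Another .cpp file could define its own `static int file_local` without any conflict, but a second definition of `shared_counter` would be a duplicate-symbol error at link time.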

Now we see how your program fits together (or: "links"): The name print has external linkage (because it's the name of a function), and so every declaration in the program refers to the same function. There's a declaration in main.cpp that you use to call the function, and there's another declaration in print.cpp that defines the function, and the two mean the same thing, which means that the thing you call in main is the exact thing you define in print.cpp.
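One way to convince yourself that separate declarations really name the same function is to compare addresses. In this sketch (hypothetical names throughout), one function sees only a local declaration of `twice`, the way main.cpp sees only a declaration of print, while another sees the definition.

```cpp
#include <cassert>

using IntFn = int (*)(int);

// Sees only a block-scope declaration of twice, no body.
IntFn address_seen_by_caller()
{
    int twice(int);   // declares a function with external linkage
    return &twice;
}

// The definition, standing in for the one in print.cpp.
int twice(int x)
{
    return 2 * x;
}

// Sees the definition directly.
IntFn address_of_definition()
{
    return &twice;
}

// Small self-check; call it from main() to try the sketch.
void same_entity_check()
{
    assert(address_seen_by_caller() == address_of_definition());
    assert(address_seen_by_caller()(21) == 42);
}
```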

The use of header files doesn't do any magic: header files are just textually substituted. Now we can see precisely what they are useful for: they hold declarations of names with external linkage, so that anyone wanting to refer to the entities thus named has an easy and maintainable way of including those declarations in their code.

You could do entirely without headers, but that would require you to know precisely how to declare the names you need, and that is generally not desirable, because the specifics of the declarations are owned by the library owner, not the user, and it is the library owner's responsibility to maintain and ship the declarations.

Now you also see what the purpose of the "linker" part of the translation toolchain is: The linker matches up references to names with external linkage. The linker fills in the reference to the print name in your first translation unit with the ultimate address of the defined entity with that name (coming from the second translation unit) in the final link.
