60

When I work on my personal C and C++ projects I usually put file.h and file.cpp in the same directory and then file.cpp can reference file.h with a #include "file.h" directive.

However, it is common to find libraries and other kinds of projects (like the Linux kernel and FreeRTOS) where all .h files are placed inside an include/ directory, while .cpp files remain in another directory. In those projects, .h files are also included with #include "file.h" instead of #include "include/file.h" as I would have expected.

I have some questions about all of this:

  1. What are the advantages of this file structure organization?
  2. Why are .h files inside include/ included with #include "file.h" instead of #include "include/file.h"? I know the real trick is inside some Makefile, but is it really better to do it that way instead of making it clear (in code) that the file we want to include is actually in the include/ directory?
muru
  • When so organized, `include` is the root of includes, therefore it is not part of the actual include name. Usually - but not always - you'll also have source files in a `src` directory, btw. – spectras Feb 01 '18 at 03:15
  • This way is more flexible, you can change the location of header files and only have to update that fact in one place, imagine the insanity of having to go through all the files in the kernel that use some library if you decided to change the directory structure and had to modify the includes one by one. This also saves a lot of typing. – Raul Sauco Feb 01 '18 at 03:19
  • 3
    Can you imagine what it would be like if each system header had to be found in a different directory? Chaos doesn’t begin to describe it. That’s why they’re collected into one, or a very few, directories. – Jonathan Leffler Feb 01 '18 at 03:19

3 Answers

56

The main reason to do this is that compiled libraries need headers in order to be consumed by the eventual user. By convention, the contents of the include directory are the headers exposed for public consumption. The source directory may have headers for internal use, but those are not meant to be distributed with the compiled library.

So when using the library, you link to the binary and add the library's include directory to your build system's header paths. Similarly, if you install your compiled library to a centralized location, you can tell which files need to be copied to the central location (the compiled binaries and the include directory) and which files don't (the source directory and so forth).
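
To make the public/internal split concrete, here is a minimal single-file sketch of how the pieces might be divided. All names and paths in it (the `shapes` namespace, `circle_area`, `include/shapes/area.h`, `src/area_detail.h`, `src/area.cpp`) are hypothetical and only mark where each part would live in a real layout; it is one possible arrangement, not a required one.

```cpp
#include <cassert>
#include <cmath>

// --- Public header: would live in include/shapes/area.h (hypothetical) ---
// This is the only part a consumer of the compiled library needs to see.
namespace shapes {
double circle_area(double radius);
}

// --- Internal header: would live in src/area_detail.h (hypothetical) ---
// A helper used by the implementation; it is not shipped with the library.
namespace shapes::detail {
inline double square(double x) { return x * x; }
}

// --- Implementation: would live in src/area.cpp (hypothetical) ---
namespace shapes {
double circle_area(double radius) {
    assert(radius >= 0.0);              // precondition of this sketch
    const double pi = std::acos(-1.0);  // pi without non-standard macros
    return pi * detail::square(radius);
}
}
```

A consumer would only ever include the public header and call `shapes::circle_area(2.0)`; the `detail` helper could change or disappear without affecting them.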

Michał Łoś
Nicol Bolas
  • 1
    To add to this... It's a way of giving you modularisation and design-by-contract. Typically the files in the include directory will also contain comments documenting the library behaviour (Doxygen is popular for this). C# has the concept of partial class definition which makes it easier to keep the implementation hidden, but you still need to decide what should be the external interface and what should be purely internal to the library. – Graham Feb 01 '18 at 10:47
14

It used to be that <header>-style includes were of the implicit path type, that is, found via the includes environment variable or a build macro, while "header"-style includes were of the explicit form, that is, located exactly relative to the source file that included them. While some build toolchains still allow for this distinction, they often default to a configuration that effectively nullifies it.
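
As a rough illustration of that distinction, here is a minimal C++17 sketch that models the lookup. It is a simplified model under stated assumptions, not a description of any particular toolchain: the names `resolve_include` and `demo_lookup` and the temporary directory name are hypothetical, and real compilers apply more rules (such as separate system directories).

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <optional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical model of header lookup.
// quoted == true  models #include "name": the including file's directory is
//                 tried first, then the search path.
// quoted == false models #include <name>: only the search path is tried.
std::optional<fs::path> resolve_include(const std::string& name,
                                        const fs::path& including_file,
                                        const std::vector<fs::path>& search_path,
                                        bool quoted) {
    assert(!name.empty());
    if (quoted) {
        fs::path local = including_file.parent_path() / name;
        if (fs::exists(local)) return local;
    }
    for (const fs::path& dir : search_path) {
        fs::path candidate = dir / name;
        if (fs::exists(candidate)) return candidate;  // first match wins
    }
    return std::nullopt;
}

// Builds a throwaway include/ + src/ tree in the system temp directory and
// checks the three cases discussed above.
bool demo_lookup() {
    const fs::path root = fs::temp_directory_path() / "include_lookup_demo";
    fs::remove_all(root);
    fs::create_directories(root / "include");
    fs::create_directories(root / "src");
    std::ofstream(root / "include" / "file.h") << "// public header\n";
    std::ofstream(root / "src" / "local.h") << "// internal header\n";

    const fs::path includer = root / "src" / "file.cpp";
    const std::vector<fs::path> search_path{root / "include"};

    auto pub = resolve_include("file.h", includer, search_path, true);
    auto local_quoted = resolve_include("local.h", includer, search_path, true);
    auto local_angle = resolve_include("local.h", includer, search_path, false);

    const bool ok = pub && *pub == root / "include" / "file.h" &&
                    local_quoted && *local_quoted == root / "src" / "local.h" &&
                    !local_angle;  // <local.h> is not on the search path
    fs::remove_all(root);
    return ok;
}
```

In this model, `#include "file.h"` written in src/file.cpp still ends up in include/ because the build adds that directory to the search path, which is the "trick in the Makefile" the question mentions.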

Your question is interesting because it raises the question of which really is better, implicit or explicit. The implicit form is certainly easier to work with:

  1. Related headers can be grouped conveniently in hierarchies of directories.
  2. You only need to include a few directories in the includes path and need not be aware of the exact location of every file. You can change versions of libraries and their related headers without changing code.
  3. DRY: a header's location is stated once, in the build configuration, not repeated in every source file.
  4. Flexible! Your build environment doesn't have to match mine, but we can often get nearly the same results.

Explicit on the other hand has:

  1. Repeatable builds. Reordering the paths in an includes macro or environment variable doesn't change which header files are found during the build.
  2. Portable builds. Just package everything from the root of the build and ship it off to another dev.
  3. Proximity of information. You know exactly where the header is with #include "\X\Y\Z". In the implicit form, you may have to go searching along multiple paths and might even find multiple versions of the same file. How do you know which one is used in the build?

Builders have been arguing over these two approaches for decades, but a hybrid of the two mostly wins out, because of the effort required to maintain builds based purely on the explicit form and the obvious difficulty of familiarizing oneself with code of a purely implicit nature. We all generally understand that our toolchains put certain common libraries and headers in particular locations so that they can be shared across users and projects, so we expect to find the standard C/C++ headers in one place. But lacking a well-documented local convention, we don't initially know anything about the structure of an arbitrary project, so we expect the code in those projects to be explicit about the non-standard bits that are unique to them and implicit about the standard bits.

It is a good practice to always use the <header> form of include for all the standard headers and other libraries that are not project specific and to use the "header" form for everything else. Should you have an include directory in your project for your local includes? That depends to some extent on whether those headers will be shipped as interfaces to your libraries or merely consumed by your code, and also on your preferences. How large and complex is your project? If you have a mix of internal and external interfaces or lots of different components, you might want to group things into separate directories.

Keep in mind that the directory structure your finished product unpacks to need not look anything like the directory structure in which you develop and build that product. If you have only a few .c/.cpp files and headers, it's OK to put them all in one directory, but eventually you're going to work on something non-trivial and will have to think through the consequences of your build environment choices, and hopefully document them so that others can understand them.

jwdonahue
  • 2
    I think this is a very comprehensive answer. I'll add that we often hybrid as well- all headers of our libraries that need external linkage (what they actually provide) are grouped in a header directory - so they can be moved easily as a block - which is often a requirement for "install". Headers only needed internally are linked to explicitly and kept with the source, to keep the source portable and group the information. – kabanus Feb 01 '18 at 06:54
  • Answer is good, although to me it's not as much "explicit vs implicit" as "hardcoded vs configurable". For instance, explicit.1 is nullified because no serious project goes without a build system, and it then comes naturally to let the build system handle paths. Same for explicit.2: you wouldn't package just the sources, you package the whole, headers, src, data, build setup, so it doesn't matter if headers are in another directory within the package. – spectras Feb 01 '18 at 10:46
2

1. .hpp and .cpp files don't necessarily have a one-to-one relationship. There may be multiple .cpp files implementing the same .hpp under different conditions (e.g. different environments). For example, in a multi-platform library, imagine there is a class that gets the version of the app, and the header looks like this:

Utilities.h

#include <string>
class Utilities {
public:
    // Implemented separately for each platform.
    static std::string getAppVersion();
};

main.cpp

#include <iostream>
#include "Utilities.h"
int main(){
    // Prints the version from whichever platform's .cpp was built in.
    std::cout << Utilities::getAppVersion() << std::endl;
    return 0;
}

There may be one .cpp for each platform, and each .cpp may be placed in a different location so that it can easily be selected for the corresponding platform, e.g.:

.cpp for iOS (path:DemoProject/ios/Utilities.cpp):

#include "Utilities.h"
std::string Utilities::getAppVersion(){
    //some objective C code
}

.cpp for Android (path:DemoProject/android/Utilities.cpp):

#include "Utilities.h"
std::string Utilities::getAppVersion(){
    //some jni code
}

Of course, the two .cpp files would normally not be compiled into the same build.
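
For readers who want to try the idea in a single file, here is a minimal sketch. Note that it swaps in a different technique: instead of the build system picking one .cpp per platform, it uses the compiler's predefined macros `__ANDROID__` and `__APPLE__` to choose the definition. The returned strings and the `checkedAppVersion` helper are placeholders invented for this example.

```cpp
#include <cassert>
#include <string>

// --- Would live in the shared header, Utilities.h ---
class Utilities {
public:
    static std::string getAppVersion();
};

// --- Would normally be one .cpp per platform, chosen by the build system.
// In this sketch the predefined macros stand in for that choice, so exactly
// one definition is compiled.
#if defined(__ANDROID__)
std::string Utilities::getAppVersion() { return "android-placeholder"; }
#elif defined(__APPLE__)
std::string Utilities::getAppVersion() { return "apple-placeholder"; }
#else
std::string Utilities::getAppVersion() { return "generic-placeholder"; }
#endif

// Hypothetical wrapper for callers: checks the contract that every platform
// implementation returns a non-empty string.
std::string checkedAppVersion() {
    std::string version = Utilities::getAppVersion();
    assert(!version.empty());
    return version;
}
```

Callers only depend on the declaration in the header, so the code in main.cpp stays the same regardless of which definition ends up in the build.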


2.

#include "file.h" 

instead of

#include "include/file.h" 

allows you to keep the source code unchanged when your headers are not placed in the "include" folder anymore.

ocomfd