26

I am very interested in studies or empirical data comparing compilation times between two C++ projects that are identical except that one uses forward declarations wherever possible and the other uses none.

How drastically can forward declarations change compilation time as compared to full includes?

#include "myClass.h"

vs.

class myClass;

Are there any studies that examine this?

I realize that this is a vague question that greatly depends on the project. I don't expect a hard number for an answer. Rather, I'm hoping someone may be able to direct me to a study about this.

The project I'm specifically worried about has about 1200 files. Each cpp on average has 5 headers included. Each header has on average 5 headers included. This recurses about 4 levels deep. It would seem that for each cpp compiled, around 300 headers must be opened and parsed, some many times. (There are many duplicates in the include tree.) There are guards, but the files are still opened. Each cpp is separately compiled with gcc, so there's no header caching.
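For a rough sense of scale, the fan-out described above can be tallied as a geometric sum. This is only a sketch under simplifying assumptions: a uniform fan-out of 5 at every level and no deduplication of repeated headers. The function name is made up for illustration.

```python
def include_directives(fanout, depth):
    """Count the #include directives reached from one .cpp file.

    Assumes every file includes exactly `fanout` headers and nothing is
    deduplicated, so repeated headers are counted each time they appear.
    """
    return sum(fanout ** level for level in range(1, depth + 1))


print(include_directives(5, 3))  # 155
print(include_directives(5, 4))  # 780
```

Depending on whether the .cpp itself counts as one of the 4 levels, the tally is 155 or 780 directives, so the estimate of around 300 sits between the two.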

To be sure no one misunderstands, I certainly advocate using forward declarations where possible. My employer, however, has banned them. I'm trying to argue against that position.

Thank you for any information.

JoshD
  • How drastically forward declarations change compilation time in comparison with what? U say in comparison when one uses fd, and the other uses none. What does the other use? – Armen Tsirunyan Oct 18 '10 at 19:41
  • @Armen Tsirunyan: In comparison with including the header file instead. I'll clarify... – JoshD Oct 18 '10 at 19:42
  • 2
    @JoshD: Doesn't it depend on what does the header contain? If the header contains a 1000 class definitions of course it will compile much slower than a fwddecl. Even if it contains one class which is huuuuge, it will compile slower... It's a cliche, but I'd say "It depends" :) – Armen Tsirunyan Oct 18 '10 at 19:45
  • 3
    If the worst problem at your shop is the compile-time difference between forward declarations and not... where do you work and are they hiring?!? ;-) – Chris Tonkinson Oct 18 '10 at 19:47
  • @Chris: Oh how I wish... yesterday I found twenty instances of `if (a < x < b)` in one of our libraries. (This was c++). – JoshD Oct 18 '10 at 19:51
  • @Chris - a contrarian view is that I find legislating this type of thing to be an antipattern in an employer. What do they do, slate you in your review if you do this? – Steve Townsend Oct 18 '10 at 19:57
  • Banned them? That makes no sense. What's the accepted solution for pairs of classes that refer to each other in their interface or are they banned too? – CB Bailey Oct 18 '10 at 20:04
  • @Charles Bailey: They'll make an exception for that. They don't want to have forward declarations used in place of `#include` directives. – JoshD Oct 18 '10 at 20:08
  • 2
    Is there another exception for the pimpl idiom, too? Or is that banned? Compile times is _one_ argument for using forward declarations where possible but it's not the first one I'd use. – CB Bailey Oct 18 '10 at 20:12
  • 2
    It all depends on the definition of "slow," too. The last large C++ project on which I worked was on the order of 1 million SLOC (not including third party libraries). There was only one time that using a forward declaration improved compilation time and it was due to a quirk related to our use of Boost.Bimap (I still don't know what the problem was; that header just killed our compile times). We didn't use forward declarations much at all and the whole thing built in 10 minutes. Incremental rebuilds were on the order of seconds. – James McNellis Oct 18 '10 at 20:22
  • @James McNellis: Boost is a killer for compile time because of templates. And one of the ways templates will be optimized in C++0x is... forward declarations :) – Matthieu Oct 18 '10 at 20:24
  • @Matthieu: We used Boost extensively; it was only that one Boost library that significantly impacted compilation performance. – James McNellis Oct 18 '10 at 20:28
  • 1
    @James McNellis: One million SLOC building in 10 minutes is very good. In my experience, it tends to be spaghetti dependencies where every source file needs to include every header file either directly or indirectly that kills build times rather than not using forward declarations where possible. Converting a well designed project with sensible dependencies to use forward declarations is easy to do, but gives only a modest benefit. Untangling a massive mess of dependencies can be very difficult and time consuming but can produce stunning speed ups. – CB Bailey Oct 18 '10 at 20:50
  • I am not going to post an answer because I can't find a reference to it. I once read an article where build times were reduced by 90% by building one big file and compiling that instead of individual files. – stonemetal Oct 18 '10 at 20:54
  • @stonemetal: I wonder if it also compared incremental build times :) – JoshD Oct 18 '10 at 20:56
  • 2
    @JoshD: "There are guards, but the files are still opened. Each cpp is separately compiled with gcc". For what it's worth, GCC is supposed to be able to optimise this. It should recognise include guards, and doesn't re-pre-process the header if the guard would mean it does nothing. http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html. How are you confirming that the files really are opened for a second time, `strace`? Maybe your include guards are non-idiomatic, so that GCC doesn't recognise them. – Steve Jessop Oct 18 '10 at 23:18
  • One possibly-good reason I can think of for not using a forward declaration in place of `#include`, is that in future the name might not be a class, it might be a typedef. Whether that implementation detail really needs to be free to change is another matter. – Steve Jessop Oct 18 '10 at 23:34

7 Answers

21

Forward declarations can make for neater, more understandable code, which surely HAS to be the goal of any decision.

Couple that with the fact that, when it comes to classes, it's quite possible for 2 classes to rely upon each other, which makes it hard to NOT use forward declarations without causing a nightmare.

Equally, forward declaring classes in a header means that you only need to include the relevant headers in the CPPs that actually USE those classes. That actually DECREASES compile time.

Edit: Given your comment above I would point out it is ALWAYS slower to include a header file than to forward declare. Any time you include a header you are necessitating a load from disk often only to find out that the header guards mean that nothing happens. That would waste immense amounts of time and is really a VERY stupid rule to be bringing in.

Edit 2: Hard data is pretty hard to obtain. Anecdotally, I once worked on a project that wasn't strict about its header includes, and the build time was roughly 45 minutes on a 512MB RAM P3-500MHz (this was a while back). After spending 2 weeks cutting down the include nightmare (by using forward declarations), I had managed to get the code to build in a little under 4 minutes. Subsequently, using forward declarations became a rule whenever possible.

Edit 3: It's also worth bearing in mind that there is a huge advantage from using forward declarations when it comes to making small modifications to your code. If headers are included all over the shop, then a modification to a header file can cause vast numbers of files to be rebuilt.

I also note lots of other people extolling the virtues of pre-compiled headers (PCHs). They have their place and they can really help but they really shouldn't be used as an alternative to proper forward declaration. Otherwise modifications to header files can cause issues with recompilation of lots of files (as mentioned above) as well as triggering a PCH rebuild. PCHs can provide a big win for things like libraries that are pre-built but they are no reason not to use proper forward declarations.

Goz
  • 11
    How do forward declarations make for neater code? I'd argue that they significantly obfuscate code and hide dependencies, making it far more difficult to understand code. – James McNellis Oct 18 '10 at 19:46
  • @James: Well it depends on whether the forward declaration is just marking a function that is called later in the same file (in this case it can mean you can structure your code far more sensibly by grouping functions together that otherwise would have interdependency nightmares). That makes code neater, IMO. – Goz Oct 18 '10 at 19:49
  • That last paragraph touches what I'm after. I expect the decrease to be quite significant, but I'd like some hard data to back that. I was hoping to obtain some rather than making my own. – JoshD Oct 18 '10 at 19:49
  • 2
    Edit2 is fantastic. Granted it's anecdotal, but that's still better than nothing. Edit1: I completely agree. – JoshD Oct 18 '10 at 20:03
  • 2
    *Any time you include a header you are necessitating a load from disk often only to find out that the header guards mean that nothing happens.* Not true. If the header has already been loaded for that TU, chances are good it's in the OS-level filesystem cache; and even if not, compilers can recognize headers that use include guards and optimize that case. –  Oct 19 '10 at 06:32
  • I suspect you're talking about msvc's pch model (which is per-TU) that differs from gcc's (which is per header). –  Oct 19 '10 at 06:33
  • @Roger: Even if its pre-compiled per TU a change to a header file that is included somewhere in the massive chain of header file nightmares will surely cause a re-build of that TU? And as its included all over the place it will trigger rebuilds of multiple TUs ... surely? Equally I've always wondered why they don't just cache the header guard details but it seems, to me, that they don't. Equally though, there will necessarily be a whole load of associated processing if it doesn't get header guarded out. Much of it unnecessary. – Goz Oct 19 '10 at 06:40
  • @Goz: Yes, that's how msvc's PCHs work: the set of pre-compiled headers is the same for all TUs, and all TUs must use the single PCH. Yes, changing anything included in that "PCH bundle" means you have to recompile every TU. This is why I'm thinking you're talking about msvc's pch model. Gcc has [cached header guards](http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html) for years, and I've heard the latest version(s) of msvc can also do that. (I guess "per-TU" could be interpreted various ways; should've been more clear about that.) –  Oct 19 '10 at 06:44
  • So are you saying that changing a header in the PCH bundle doesn't cause lots of rebuilds under GCC? Because that's what I'm trying, maybe unsuccessfully, to say :) As for cached header guards. Fair enough .. however it's still going to be slower (marginally) to check against a cache than to not do so at all ... – Goz Oct 19 '10 at 07:41
  • @Goz: (I don't get the notification without @Roger, btw.) Gcc's pre-compiled header model *doesn't have* a pch bundle: each header is independently compiled and cached, then the cached version is checked when the header is included anywhere. (I'm not saying gcc's model is better than msvc's or vice versa; both have advantages.) You can't compare checking a cache against not doing anything; you have to compare checking a cache against manually keeping that information yourself. (And which of those two makes more sense depends on the project/situation specifics.) –  Oct 20 '10 at 05:19
  • @Roger: Didn't know about the @ thing btw. I just usually do it because it makes comment discussions easier to follow :) Fair enough on the GCC header files. I assumed that it would compile a whole set of headers (ie header A includes headers B, C, D which include E, F, G, etc). Definitely interesting to know. Still, though, even using the pre-compiled header will be slower than not including the header at all ... – Goz Oct 20 '10 at 07:37
  • @Goz: I think we're going in circles, comparing isn't useful without equivalents. Yes, it can sometimes be faster to use a declaration instead, but the trade-off is manual synchronization, etc. and that trade-off can be worthwhile sometimes (as I said in my last comment). However, I'll end with at least two standard library headers for which I believe including is always faster: [ciso646](http://bitbucket.org/rdpate/stdtags/src/03766f859aa5/c++03/ciso646) and [iso646.h](http://bitbucket.org/rdpate/stdtags/src/03766f859aa5/c++03/iso646.h). (A conforming implementation only needs empty files. ;) –  Oct 20 '10 at 08:43
  • 1
    @James Forward declarations are cleaner because `#includes` are a box of chocolates - you never know what you're gonna git – bobobobo Jul 21 '11 at 19:00
11

Have a look at John Lakos's excellent book Large-Scale C++ Software Design. I think he has some figures for forward declarations, obtained by looking at what happens if you include N headers M levels deep.

If you don't use forward declarations, then aside from increasing the total build time from a clean source tree, you also vastly increase the incremental build time, because header files are being included unnecessarily. Say you have 4 classes, A, B, C and D. C uses A and B in its implementation (i.e. in C.cpp) and D uses C in its implementation. The interface of D is forced to include C.h because of this 'no forward declaration' rule. Similarly, C.h is forced to include A.h and B.h, so whenever A or B is changed, D.cpp has to be rebuilt even though it has no direct dependency. As the project scales up, touching any header forces huge amounts of code to be rebuilt that simply don't need to be.
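The A, B, C, D scenario can be modelled as a small include graph to see which .cpp files a header change touches. This is a minimal sketch: the two graphs are my reading of the example above, the function name is hypothetical, and a real build system tracks dependencies in more detail.

```python
def files_to_rebuild(includes, changed_header):
    """Return the .cpp files whose transitive includes reach changed_header."""
    def reaches(file, seen):
        if file == changed_header:
            return True
        if file in seen:
            return False
        seen.add(file)
        return any(reaches(dep, seen) for dep in includes.get(file, ()))

    return sorted(f for f in includes
                  if f.endswith(".cpp") and reaches(f, set()))


# Rule "no forward declarations": headers must include what they mention.
no_forward_decls = {
    "A.cpp": ["A.h"], "B.cpp": ["B.h"],
    "C.cpp": ["C.h"], "C.h": ["A.h", "B.h"],
    "D.cpp": ["D.h"], "D.h": ["C.h"],
}

# Headers forward declare; only the implementations include the full headers.
with_forward_decls = {
    "A.cpp": ["A.h"], "B.cpp": ["B.h"],
    "C.cpp": ["C.h", "A.h", "B.h"], "C.h": [],
    "D.cpp": ["D.h", "C.h"], "D.h": [],
}

print(files_to_rebuild(no_forward_decls, "A.h"))    # ['A.cpp', 'C.cpp', 'D.cpp']
print(files_to_rebuild(with_forward_decls, "A.h"))  # ['A.cpp', 'C.cpp']
```

In this toy graph the difference is a single file, but every additional file that includes D.h would join the first list and not the second.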

To have a rule that disallows forward declaration is (in my book) very bad practice indeed. It's going to waste huge amounts of time for the developers for no gain. The general rule of thumb should be that if the interface of class B depends on class A then it should include A.h, otherwise forward declare it. In practice 'depends on' means inherits from, uses as a member variable or 'uses any methods of'. The Pimpl idiom is a widespread and well understood method for hiding the implementation from the interface and allows you to vastly reduce the amount of rebuilding needed in your codebase.

If you can't find the figures from Lakos then I would suggest creating your own experiments and taking timings to prove to your management that this rule is absolutely wrong-headed.

the_mandrill
  • 3
    Note that _Large-Scale C++ Software Design_ was published in 1996. There have been huge improvements in compiler performance since then (most notably, I don't think precompiled headers were supported by most compilers in 1996). – James McNellis Oct 18 '10 at 20:47
  • Thank you very much. This is quite helpful. – JoshD Oct 18 '10 at 20:50
  • 2
    @James: yes, precompiled headers and multithreaded/parallelising compilers have moved on a long way, but also our company's codebase has also vastly increased in size since 1996. I think the core tenets of the book are as relevant today as they were back then. – the_mandrill Oct 18 '10 at 20:55
5

I made a small demo that generates an artificial codebase and tests this hypothesis. It generates 200 headers. Each header has a struct with 100 fields and a comment 5000 bytes long. 500 .c files are used for benchmarking; each either includes all the header files or forward declares all the structs. To make it more realistic, each header is also included in its own .c file.

The result is that using includes took me 22 seconds to compile while using forward declarations took 9 seconds.

generate.py

#!/usr/bin/env python3

import random
import string

include_template = """#ifndef FILE_{0}_{1}
#define FILE_{0}_{1}

{2}
//{3}

struct c_{0}_{1} {{
{4}}};

#endif
"""

def write_file(name, content):
    # Write one generated file into ./src (run_test.sh creates the directory).
    with open("./src/" + name, "w") as f:
        f.write(content)

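GROUPS = 200            # number of groups; each gets one file_<i>_x.h header
FILES_PER_GROUP = 0     # extra nested headers per group; 0 disables nesting
EXTRA_SRC_FILES = 500   # benchmark .c files generated for each variant
# 5000-byte filler comment placed in every header
COMMENT = ''.join(random.choices(string.ascii_uppercase + string.digits, k=5000))
# 100 int fields for the struct in every header
VAR_BLOCK = "".join(["int var_{0};\n".format(k) for k in range(100)])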

main_includes = ""
main_fwd = ""
for i in range(GROUPS):
    include_statements = ""
    for j in range(FILES_PER_GROUP):
        write_file("file_{0}_{1}.h".format(i,j), include_template.format(i, j, "", COMMENT, VAR_BLOCK))
        write_file("file_{0}_{1}.c".format(i,j), "#include \"file_{0}_{1}.h\"\n".format(i,j))
        include_statements += "#include \"file_{0}_{1}.h\"\n".format(i, j)
        main_includes += "#include \"file_{0}_{1}.h\"\n".format(i,j)
        main_fwd += "struct c_{0}_{1};\n".format(i,j)
    write_file("file_{0}_x.h".format(i), include_template.format(i, "x", include_statements, COMMENT, VAR_BLOCK))
    write_file("file_{0}_x.c".format(i), "#include \"file_{0}_x.h\"\n".format(i))
    main_includes += "#include \"file_{0}_x.h\"\n".format(i)
    main_fwd += "struct c_{0}_x;\n".format(i)

main_template = """
{0}

int main(void) {{ return 0; }}

"""

for i in range(EXTRA_SRC_FILES):
    write_file("extra_inc_{0}.c".format(i), main_includes)
    write_file("extra_fwd_{0}.c".format(i), main_fwd)

write_file("maininc.c", main_template.format(main_includes))
write_file("mainfwd.c", main_template.format(main_fwd))


run_test.sh

#!/bin/bash

mkdir -p src
./generate.py
ls src/ | wc -l
du -h src/
gcc -v
# Include-based build first, then the forward-declaration build,
# so the two timings print in that order.
echo src/file_*_*.c src/extra_inc_*.c src/maininc.c | xargs time gcc -o inc.out
echo src/file_*_*.c src/extra_fwd_*.c src/mainfwd.c | xargs time gcc -o fwd.out
rm -rf fwd.out inc.out src

Results

$ ./run_test.sh 
    1402
8.2M    src/
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.3 (clang-1103.0.32.29)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
       22.32 real        13.56 user         8.27 sys
        8.51 real         4.44 user         3.78 sys
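To put those two wall-clock figures in proportion, here is a small arithmetic sketch. The function names are made up for illustration; the inputs are the `real` times from the run above.

```python
def speedup(slow_seconds, fast_seconds):
    """How many times faster the quicker build is."""
    return slow_seconds / fast_seconds


def time_saved_fraction(slow_seconds, fast_seconds):
    """Fraction of the slower build's wall-clock time that was saved."""
    return 1 - fast_seconds / slow_seconds


include_real = 22.32   # build using #include everywhere
forward_real = 8.51    # build using forward declarations

print(f"speedup: {speedup(include_real, forward_real):.1f}x")                 # speedup: 2.6x
print(f"time saved: {time_saved_fraction(include_real, forward_real):.0%}")  # time saved: 62%
```

This is one synthetic benchmark on one machine, so the ratio should not be read as a prediction for a real codebase.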

Henrique Jung
Artium
4
#include "myClass.h"

is 1..n lines

class myClass;

is 1 line.

You will save time unless all your headers are one-liners. A forward declaration adds essentially nothing to the compilation itself: it just tells the compiler that the name refers to a class defined elsewhere, and it is only possible when the compiler doesn't need anything from the definition (the object's size, for example). So the time spent reading the included files is saved every time you replace an include with a forward declaration. There's no general measurement for this, as it is a per-project value, but it is a recommended practice for large C++ projects (see Large-Scale C++ Software Design by John Lakos for more tricks to manage large C++ projects, even if some of them are dated).

Another way to limit the time the compiler spends on headers is pre-compiled headers.

Matthieu
  • 1
    There is only a *very loose* relationship between LOC and time to compile. Very, *very* loose. –  Oct 19 '10 at 06:35
2

You've asked a very general question that's elicited some very good general answers. But your question wasn't about your actual problem:

> To be sure no one misunderstands, I certainly advocate using forward declarations where possible. My employer, however, has banned them. I'm trying to argue against that position.

We have some information on the project, but not enough:

> The project I'm specifically worried about has about 1200 files. Each cpp on average has 5 headers included. Each header has on average 5 headers included. This recurses about 4 levels deep. It would seem that for each cpp compiled, around 300 headers must be opened and parsed, some many times. (There are many duplicates in the include tree.) There are guards, but the files are still opened. Each cpp is separately compiled with gcc, so there's no header caching.

What have you done towards using gcc's precompiled headers? What difference does it make in compile times?

How long does it take to compile a clean build now? How long are your typical (non-clean/incremental) builds? If, as in James McNellis' example in the comments, incremental build times are already a matter of seconds:

> The last large C++ project on which I worked was on the order of 1 million SLOC (not including third party libraries). ... We didn't use forward declarations much at all and the whole thing built in 10 minutes. Incremental rebuilds were on the order of seconds.

Then it doesn't really matter how much time would be saved by avoiding includes: shaving seconds off builds surely won't matter for many projects.

Take a small representative portion of your project and convert it to what you'd like it to be. Measure the differences in compilation time between the unconverted and the converted versions of that sample. Remember to touch (or the equivalent of make --assume-new) various sets of files to represent real builds you'd encounter while working.
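The measurement itself can be scripted. Below is a minimal sketch of touching a set of files and timing the rebuild that follows. The helper names are hypothetical, and the build command is whatever your project actually uses.

```python
import os
import subprocess
import time


def touch(path):
    """Mark a file as modified, like the `touch` command."""
    os.utime(path, None)


def timed_build(build_cmd, touched_files=()):
    """Touch the given files, run build_cmd, and return elapsed seconds.

    build_cmd is an argument list such as ["make"]; it is an assumption
    about your build setup, not something this sketch can know.
    """
    for path in touched_files:
        touch(path)
    start = time.perf_counter()
    subprocess.run(build_cmd, check=True)
    return time.perf_counter() - start
```

For example, `timed_build(["make"], ["include/widget.h"])` (with a made-up header path) would time the incremental build triggered by editing that one header. Running it on both the unconverted and the converted sample, with the same sets of touched files, gives numbers you can compare directly.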

Show your employer how you'd be more productive.

1

Uhmm, the question is rather unclear. Put simply, it depends.

In an arbitrary scenario I think translation units will not become shorter or easier to compile. The most widely cited purpose of forward declarations is to provide convenience to the programmer.

Keynslug
0

For people using MS Visual Studio, check out a great plugin called Compile Score by Ramon Viladomat.

It pulls information from Clang or MSBuild (pdb) and shows how much time each file operation takes within the entire build run, separating front-end (pre-processor work) from back-end (actual code gen). You can even see which .cpp files included a specific .h and search for the low-hanging fruit to speed up your builds. Lots of options and nifty features. Definitely worth a try if you have large projects.

StarShine