
I am new to precompiled headers, and am just wondering what to include. Our project has about 200 source files.

So, do I literally include every third party library?

If I use a map in three source files, do I add it? What if I use it in only one, do I add it? Do I need to remove the old direct includes, or do the ifdef guards and pragma once directives still work?

Are there any third party libraries you wouldn't add?

Doesn't the precompiled header then get massive?

As in, isn't there an overhead of having all these headers included everywhere all of a sudden, even in precompiled form?

EDIT:

I found some information on clang:

A precompiled header implementation improves performance when:

  • Loading the PCH file is significantly faster than re-parsing the bundle of headers stored within the PCH file. Thus, a precompiled header design attempts to minimize the cost of reading the PCH file. Ideally, this cost should not vary with the size of the precompiled header file.
  • The cost of generating the PCH file initially is not so large that it counters the per-source-file performance improvement due to eliminating the need to parse the bundled headers in the first place. This is particularly important on multi-core systems, because PCH file generation serializes the build when all compilations require the PCH file to be up-to-date.

Clang's precompiled headers are designed with a compact on-disk representation, which minimizes both PCH creation time and the time required to initially load the PCH file. The PCH file itself contains a serialized representation of Clang's abstract syntax trees and supporting data structures, stored using the same compressed bitstream as LLVM's bitcode file format.

Clang's precompiled headers are loaded "lazily" from disk. When a PCH file is initially loaded, Clang reads only a small amount of data from the PCH file to establish where certain important data structures are stored. The amount of data read in this initial load is independent of the size of the PCH file, such that a larger PCH file does not lead to longer PCH load times. The actual header data in the PCH file--macros, functions, variables, types, etc.--is loaded only when it is referenced from the user's code, at which point only that entity (and those entities it depends on) are deserialized from the PCH file. With this approach, the cost of using a precompiled header for a translation unit is proportional to the amount of code actually used from the header, rather than being proportional to the size of the header itself.

To me this seems to indicate that clang, at least:

  • has taken care to make load times of precompiled headers independent of their size.
  • keeps use times of precompiled headers independent of the precompiled header's size as well; they are proportional to the amount of information actually used.
  • Contrary to the answers given so far, this seems to indicate that even when an external header (say <map>) is included just once, it is worthwhile adding it to the precompiled header (it will still speed up re-compilation of that source file).

There must be some sort of index to map out all the info. That index might get larger, but maybe that isn't so important? I'm not sure whether I got this right, though, or whether it applies to all compilers...

  • I mostly put headers in there that I know won't change and that are needed in general, like for example `` or a header file with all the project-specific types or static variables – Zaiborg May 28 '14 at 06:40
  • Generally stuff that doesn't change that is used in more than a few files. Time the build before and after you make any changes to help you decide if it was a good change or not. – Retired Ninja May 28 '14 at 08:16
  • About inclusion of <map>: I still claim "don't do it if you use it in a single file". Reasons: even with clang's optimizations, the initial table of symbols gets bigger. If you include a single file, it might be insignificant, but what if you include tens or hundreds of such files? Each and every time a source file is compiled, the table must be loaded. Yes, it's a much smaller overhead than loading the whole PCH, but I wouldn't count on it being insignificant. Another reason is that the PCH file might be cached by the OS as an optimization. If the file grows bigger, it might exceed the... –  May 28 '14 at 08:52
  • ...the cache (even though it might seem unlikely). Of course, neither of my arguments is backed by measurement, meaning it might be a case of "premature optimization", but still, if there is no gain in adding it (if you don't add it and just include it in the single source file, it still only has to be compiled once), no gain at all, why make the PCH file bigger? –  May 28 '14 at 08:56
  • @Laethnes: You wouldn't see a gain if the PCH is re-compiled, but you would certainly see a gain if it isn't. Imagine windows.h instead of map. Even if it's included only once, not having to re-compile it is a clear gain. As to whether that exceeds the cost in other files, I don't know. I was hoping for some best practices... – Cookie May 28 '14 at 09:03
  • @Cookie Yes, you're right, my apologies. The second part of my last comment is wrong - I finally realized your point and the speed gain. –  May 28 '14 at 09:04
  • @Cookie So, back to your question about the speed gain - yes, I think there will be one. If you "just" include it for a single source file, that file's recompilation should be faster. However, what if you have hundreds of such sources and separate includes? Loading and parsing the basic symbol table takes longer with each added header. So there must be a limit where processing this symbol table takes longer than processing a single header, meaning this practice eventually (when used for too many source files) starts slowing the compilation down instead of speeding it up. –  May 28 '14 at 09:07
  • That, essentially, is my question. [This question](http://stackoverflow.com/questions/688053/what-to-put-in-precompiled-header-msvc) seems to indicate for example that every boost and stl header belongs in the precompiled header. – Cookie May 28 '14 at 09:09
  • @Cookie And about best practices - I apologize, but in this regard I cannot offer anything new other than: try it. Measure it. See for yourself. The problem is that the optimal set of headers depends on your project. Even if you had a tool that calculated the best subset of headers, the answer would change every time you add or remove a file, change includes, or even change a source file. –  May 28 '14 at 09:12
  • @Cookie I see. I still think the best thing is practice - to actually see how it affects the compilation speed. About STL and Boost - I totally agree, but ONLY if you use them. (And ONLY the headers you would use.) But I would also add other headers I use often - for example, in my projects I use GLEW, which contains lots of symbols (plus GL and GLU). –  May 28 '14 at 09:17
  • I disagree. So often people get asked to "try it". What do you even measure? The re-compile time for the single file with the include? No need, that will go down. The re-compile time for other files without the include? No need, that will go up. Compile time for changes in x source files? That will take ages. Surely, in a big company where tens or hundreds of programmers share a code base, the guidance given to every programmer isn't "try and measure". And as you say, it depends on circumstances, so the whole exercise needs constant repeating. There has got to be guidance out there. – Cookie May 28 '14 at 09:18
  • @Laethnes Ok. Btw., the measuring problem is the reason why I think this really depends on the given project - but that's only my opinion. –  May 28 '14 at 10:16

2 Answers


There is no 100% exact answer to this, as it depends on your project. The best thing is to try it yourself and see what happens.

However,

"So, do I literally include every third party library? "

No. Basically, you include headers which:

  • Are used often by your sources. "Often" is not precisely defined, but let's say it means used by more than 10% of your source files. (I picked that number arbitrarily, however; maybe it should be bigger.)
  • Rarely change (because a change to a single header in the precompiled header means you need to recompile all your sources). Third-party libraries are not expected to change, so they're the best candidates, but you can also add headers from your own project if you're sure they will change only rarely or in exceptional cases.

But don't "just" include all the headers of a library. Include only the ones you're actually using (see the sketch below).

"If I use a map in three source files, do I add it?"

See above. There is no clear answer to this, but personally I think three source files is too few.

"What if I use it one, do I add it?"

(I understand the question as: "What if I use a header file in a single source file and add it to the precompiled header?")

Nothing that would break your application. But it will make:

  • Compilation of precompiled header longer.
  • Precompiled header file bigger.
  • Compilation of source files slower.

In the case of a single header of average size, the effect is completely insignificant. However, if you add hundreds of such headers, you slow down the whole compilation.

"Do I need to remove the old direct include or are the ifdef and pragma once directives still working?"

You probably could do that, but I highly recommend NOT doing it. You're not required to, however.

You can think of a precompiled header as nothing more than an include of that header at the beginning of all your sources. Example:

precompiled.h

#include <iostream>
#include <string>

MyClass.cpp

#include "MyClass.h"

MyClass::MyClass()
{
    // etc.
}

Now let's say you enable the precompiled header. For the source file it's the same as if you had written:

#include "precompiled.h"
#include "MyClass.h"

MyClass::MyClass()
{
    // etc.
}

Could you do that by hand? Yes, you could! A precompiled header acts exactly like that (only faster), which means:

  • Yes, macros are preserved. Whatever is defined in the precompiled header is defined in all source files (see the sketch after this list), which means:
    • Include guards work normally.
    • If there is a library which detects the OS, all of its macros are still defined and available.
    • If some other macro is defined (for example MIN, although using std::min is now recommended), you can still use it normally.
  • I don't know about #pragma once for certain, but I believe it works normally, too.
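Here is a minimal sketch of that; the file names and the LOG macro are hypothetical, and it assumes the build force-includes precompiled.h into every translation unit, as precompiled-header setups typically do.

precompiled.h

#pragma once
#include <cstdio>
#define LOG(msg) std::printf("%s\n", msg) // a macro made available through the precompiled header

logger.cpp

// No direct #include here: precompiled.h is effectively prepended by the compiler.
void log_startup()
{
    LOG("starting up"); // works because the macro came in through the precompiled header
}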

About removing includes in sources: as I stated above, I'm strongly against it. The reason is simple: what if you need to turn off precompiled headers in the future? (In fact, I personally turn the precompiled header off from time to time to see whether my code still compiles. My personal reason is that if you release the code, some users will not use your project/make files but will create their own project (for example, if they use a different IDE such as Code::Blocks or QtCreator), so I try to set up my project in such a way that all you need to do is add the source files, configure the correct include paths, link the correct libraries, and it should compile.) A sketch of what that looks like in practice follows below.
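As a hedged sketch of that practice (reusing the MyClass example from above; the <string> and <iostream> includes and the constructor body are just illustrative), the source file keeps its own includes even though the precompiled header already provides them:

MyClass.cpp

#include "MyClass.h"
#include <string>   // also in precompiled.h; the include guard makes the repeat a no-op
#include <iostream> // kept so this file still compiles if precompiled headers are disabled

MyClass::MyClass()
{
    std::cout << std::string("MyClass constructed") << '\n';
}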

"Are there any third party libraries you wouldn't add?"

I can't think of any...

On the other hand, I can think of some which are, IMHO, best added (if you use them) - for example Boost. It relies heavily on templates, and according to my personal experience templates slow compilation down the most, because you need to include not only the declarations but also the definitions. IMHO that's the biggest weakness of C++ templates.

"Doesn't the precompiled header then get massive? "

They can. That's why you need to find the best subset of header files (and not blindly include everything) to get the optimal result. Mine is about 50 MB, but it still speeds up the compilation very much (by whole minutes, as I use templates quite a lot).

"As in, isn't there an overhead of having all these headers included everywhere all of a sudden, even in precompiled form?"

If you use a precompiled header, you prepare some set of headers and include it in all source files. From the point of view of a single source file, this means you include some headers that the file doesn't need. That is the overhead. However, including a precompiled header is much faster, so if you pull in a few unneeded headers, it will still be faster overall. But when you cross some limit (let's say when, for more than 90% of your sources, more than 90% of the included headers are unneeded), using the precompiled header starts to slow compilation down instead. That's why you need to include the headers that are used the most and avoid including headers that are used only in a few source files (or not at all).

Generally, using precompiled headers increases the required disk space (these days, absolutely insignificant) and the required RAM (again, these days not very important). It's a perfect example of "getting more speed at the cost of memory".


My last piece of advice is simple: try it yourself. Check what happens when you add headers, and when you feel your compilation is getting slow, check whether you're including something that is mostly unused.


As you may know, compiling C/C++ source code is a time-consuming task, and one reason for this is that the compiler needs to compile every chunk of code that you directly or indirectly include in your source. In most cases this is redundant, because most of the included files are library headers that will not change over time. To alleviate this, the notion of precompiled headers was introduced. Through precompiled headers you can tell the compiler that a set of include files is very unlikely to change over time; the compiler can then optimize the build by compiling those files once and saving the result. Whenever the compiler next builds the project, it skips compiling those headers and reuses the saved result.

So it is a good idea to put files that are not subject to frequent changes into the precompiled header. Of course you can add code that changes frequently, but that defeats the whole point of using precompiled headers.

And by the way, do not worry about precompiled headers getting massive. The concept is mainly designed to reduce the compilation time of large projects, which usually pull in a plethora of third-party libraries. In those cases the precompiled header normally should and will get massive.

See also Wikipedia's entry on Precompiled Headers.
