There are basically two things that cause long compilation times in C++: too many includes and too many template instantiations.
When you include too many headers, and those headers include too many headers of their own, the compiler simply has a lot of work to do just to load all of those files, and it will spend an inordinate amount of time on the processing passes it has to run on all of that code, whether it is actually used or not: pre-processing, lexical analysis, AST building, and so on. This is especially problematic when the code is spread over a large number of small headers, because the performance becomes very much I/O bound (lots of time wasted just fetching and reading files from the disk). Unfortunately, Boost libraries tend to be structured exactly this way.
Here are a few ways or tools to solve this problem:
- You can use the "include-what-you-use" tool. This is a Clang-based analysis tool that basically looks at what you are actually using in your code, and which headers those things come from, and then reports on any potential optimizations you could make by removing certain unnecessary includes, using forward-declarations instead, or maybe replace the broader "all-in-one" headers with the more fine-grained headers.
- Most compilers have options to dump the preprocessed source (on GCC / Clang, these are the -E or -E -P options, or you can simply use GCC's C preprocessor program, cpp, directly). You can take your source file, comment out different include statements or groups of include statements, and dump the preprocessed source to see the total amount of code that the different headers pull in (perhaps piped through a line-count command, like g++ -E -P my_source.cpp | wc -l). This can help you identify, in sheer number of lines of code to process, which headers are the worst offenders. Then you can see what you can do to avoid them or mitigate the issue somehow.
- You can also use pre-compiled headers. This is a feature supported by most compilers that lets you mark certain headers (especially oft-included "all-in-one" headers) to be pre-compiled, so that they do not have to be re-parsed for every source file that includes them (a sketch of a GCC setup follows the list).
- If your OS supports it, you can put your code and the headers of your external libraries on a ram-disk. A ram-disk sets aside part of your RAM and makes it look like a normal disk / file system. This can significantly reduce compilation times by cutting the I/O latency, since all the headers and source files are read from RAM instead of the actual hard drive.
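For illustration, here is the kind of change include-what-you-use typically suggests; Widget, widget.h and widget_user.h are hypothetical names standing in for your own code:

```
// widget_user.h -- hypothetical header. Before:
//
//   #include "widget.h"   // pulls in the full Widget definition and everything it includes
//
// After: a forward declaration is enough, because this header only uses
// Widget through pointers and references.

class Widget;   // forward declaration instead of #include "widget.h"

class WidgetUser {
public:
    void attach(Widget* w);          // pointers and references to an
    void render(const Widget& w);    // incomplete type are fine here
private:
    Widget* widget_ = nullptr;
};

// widget_user.cpp still does #include "widget.h", since it needs the complete
// type; but that include no longer spreads to every file that includes widget_user.h.
```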
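And here is a minimal sketch of a pre-compiled header setup with GCC; the file name pch.h and the headers listed in it are just placeholders for whatever you include everywhere:

```
// pch.h -- hypothetical "catch-all" header to pre-compile
#include <vector>
#include <string>
#include <map>
#include <algorithm>
// ... plus whichever heavy third-party headers (e.g. Boost) you include everywhere

// Pre-compile it once:
//
//   g++ -std=c++17 -x c++-header pch.h
//
// This produces pch.h.gch. From then on, GCC will use pch.h.gch instead of
// re-parsing pch.h whenever a source file includes "pch.h" at the top,
// provided the compilation flags match the ones used to build the .gch file.
```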
The second problem is that of template instantiations. In the time report from GCC (-ftime-report), there should be a value reported somewhere for the template instantiation phase. If that number is high, which it will be as soon as there is any significant amount of template meta-programming involved in the code, then you will need to work on that problem. There are lots of reasons why template-heavy code can be painfully slow to compile, including deeply recursive instantiation patterns, overly fancy SFINAE tricks, abuse of type traits and concept checks, and good old-fashioned over-engineered generic code.

But there are also simple tricks that can fix a lot of issues, like using unnamed namespaces (to avoid the time wasted generating symbols for instantiations that don't really need to be visible outside the translation unit) and specializing type-trait or concept-check templates (to basically "short-circuit" much of the fancy meta-programming that goes into them). Another option is to use extern templates (from C++11) to control where specific templates get instantiated (e.g., in one dedicated cpp file) and avoid re-instantiating them in every translation unit that uses them.
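As a rough illustration of those last two tricks, here is a sketch assuming a hypothetical expensive trait is_serializable and a hypothetical class template Matrix that many translation units instantiate as Matrix<double>:

```
#include <type_traits>

// Trick 1: short-circuit an expensive trait.
// Assume is_serializable<T> is some deeply recursive, expensive-to-evaluate
// trait defined elsewhere in the code base (here only declared).
template <typename T>
struct is_serializable;

struct Packet { /* ... */ };

// If you already know the answer for a heavily used type, an explicit
// specialization skips all of the recursive meta-programming:
template <>
struct is_serializable<Packet> : std::true_type {};

// Trick 2: extern templates (C++11).
// In the header (e.g. matrix.h):
template <typename T>
class Matrix { /* ... heavy members and member templates ... */ };

extern template class Matrix<double>;   // "do not instantiate this here"

// In exactly one cpp file (e.g. matrix.cpp):
template class Matrix<double>;          // the single explicit instantiation
```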
Here are a couple of ways or tools to help you identify the bottlenecks:
- You can use the "Templight" profiling tool (and its auxiliary "Templight-tools" for dealing with the traces). This is again a Clang-based tool that can be used as a drop-in replacement for the Clang compiler (the tool is actually an instrumented full-blown compiler) and it will generate a complete profile of all the template instantiations that occur during compilation, including the time spent on each (and optionally, memory consumption estimates, although this will affect the timing values). The traces can later be converted to a Callgrind format and be visualized in KCacheGrind, just read the description on that on the templight-tools page. This can basically be used like a typical run-time profiler, but for profiling the time and memory consumption when compiling template-heavy code.
- A more rudimentary way of finding the worst offenders is to create test source files that each instantiate a particular template you suspect is responsible for the long compilation times. Then you compile those files, time them, and work your way (perhaps in a "binary search" fashion) towards the worst offenders (see the sketch after this list).
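For example, a throwaway translation unit like the one below, where heavy_lib.h and big_computation are hypothetical stand-ins for whatever template you suspect, can be compiled and timed in isolation:

```
// test_instantiation.cpp -- a minimal translation unit that does nothing but
// instantiate the template under suspicion, so its compile time can be
// measured in isolation, e.g. with:  time g++ -std=c++17 -c test_instantiation.cpp
#include "heavy_lib.h"   // hypothetical header providing big_computation<T>

// Force the instantiations you suspect are expensive:
template double big_computation<double>(double);
template float  big_computation<float>(float);

// No main() is needed; compiling with -c is enough to trigger the instantiations.
```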
But even with these tricks, identifying template instantiation bottlenecks is easier than actually solving them. So, good luck with that.