16

I have a project that makes massive use of templates. Recently the compilation time rose quite abruptly. I wonder if there's a way for seeing what classes / lines require the most time to be compiled by g++.

Here is some output from -ftime-report:

Execution times (seconds)
 TOTAL                 :   0.30             0.05             0.37               9119 kB

Execution times (seconds)
 garbage collection    :   0.91 ( 6%) usr   0.00 ( 0%) sys   0.92 ( 5%) wall       0 kB ( 0%) ggc
 callgraph construction:   0.23 ( 2%) usr   0.11 ( 3%) sys   0.37 ( 2%) wall   10652 kB ( 1%) ggc
 callgraph optimization:   0.18 ( 1%) usr   0.12 ( 3%) sys   0.28 ( 2%) wall   11906 kB ( 2%) ggc
 varpool construction  :   0.04 ( 0%) usr   0.01 ( 0%) sys   0.08 ( 0%) wall    6984 kB ( 1%) ggc
 cfg construction      :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall     644 kB ( 0%) ggc
 cfg cleanup           :   0.05 ( 0%) usr   0.02 ( 0%) sys   0.05 ( 0%) wall       7 kB ( 0%) ggc
 trivially dead code   :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.12 ( 1%) wall       0 kB ( 0%) ggc
 df scan insns         :   0.37 ( 3%) usr   0.03 ( 1%) sys   0.43 ( 2%) wall     677 kB ( 0%) ggc
 df live regs          :   0.07 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes:   0.08 ( 1%) usr   0.01 ( 0%) sys   0.08 ( 0%) wall    2755 kB ( 0%) ggc
 register information  :   0.05 ( 0%) usr   0.01 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 alias analysis        :   0.01 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall     878 kB ( 0%) ggc
 rebuild jump labels   :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 preprocessing         :   0.19 ( 1%) usr   0.44 (11%) sys   0.68 ( 4%) wall    5284 kB ( 1%) ggc
 parser                :   3.94 (28%) usr   1.43 (35%) sys   4.94 (27%) wall  355964 kB (48%) ggc
 name lookup           :   1.35 ( 9%) usr   0.88 (21%) sys   2.76 (15%) wall   64919 kB ( 9%) ggc
 inline heuristics     :   0.14 ( 1%) usr   0.03 ( 1%) sys   0.14 ( 1%) wall       0 kB ( 0%) ggc
 integration           :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall      20 kB ( 0%) ggc
 tree gimplify         :   0.31 ( 2%) usr   0.07 ( 2%) sys   0.28 ( 2%) wall   24598 kB ( 3%) ggc
 tree eh               :   0.07 ( 0%) usr   0.02 ( 0%) sys   0.11 ( 1%) wall    7267 kB ( 1%) ggc
 tree CFG construction :   0.04 ( 0%) usr   0.04 ( 1%) sys   0.11 ( 1%) wall   15754 kB ( 2%) ggc
 tree CFG cleanup      :   0.12 ( 1%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       3 kB ( 0%) ggc
 tree find ref. vars   :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.02 ( 0%) wall     963 kB ( 0%) ggc
 tree PHI insertion    :   0.00 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall     351 kB ( 0%) ggc
 tree SSA rewrite      :   0.03 ( 0%) usr   0.01 ( 0%) sys   0.01 ( 0%) wall    4078 kB ( 1%) ggc
 tree SSA other        :   0.03 ( 0%) usr   0.06 ( 1%) sys   0.12 ( 1%) wall    1504 kB ( 0%) ggc
 tree operand scan     :   0.04 ( 0%) usr   0.02 ( 0%) sys   0.08 ( 0%) wall   10781 kB ( 1%) ggc
 dominance computation :   0.15 ( 1%) usr   0.04 ( 1%) sys   0.15 ( 1%) wall       0 kB ( 0%) ggc
 out of ssa            :   0.09 ( 1%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 expand vars           :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.03 ( 0%) wall    1840 kB ( 0%) ggc
 expand                :   0.45 ( 3%) usr   0.04 ( 1%) sys   0.59 ( 3%) wall   37695 kB ( 5%) ggc
 post expand cleanups  :   0.08 ( 1%) usr   0.02 ( 0%) sys   0.06 ( 0%) wall    4542 kB ( 1%) ggc
 varconst              :   0.15 ( 1%) usr   0.03 ( 1%) sys   0.12 ( 1%) wall    3595 kB ( 0%) ggc
 jump                  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall    1904 kB ( 0%) ggc
 mode switching        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 integrated RA         :   1.33 ( 9%) usr   0.09 ( 2%) sys   1.49 ( 8%) wall   18163 kB ( 2%) ggc
 reload                :   0.60 ( 4%) usr   0.10 ( 2%) sys   0.62 ( 3%) wall    8668 kB ( 1%) ggc
 thread pro- & epilogue:   0.17 ( 1%) usr   0.00 ( 0%) sys   0.20 ( 1%) wall   11884 kB ( 2%) ggc
 reg stack             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 final                 :   0.71 ( 5%) usr   0.10 ( 2%) sys   0.84 ( 5%) wall    6251 kB ( 1%) ggc
 symout                :   1.10 ( 8%) usr   0.16 ( 4%) sys   1.19 ( 6%) wall  100954 kB (14%) ggc
 uninit var analysis   :   0.03 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 early local passes    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 rest of compilation   :   0.49 ( 3%) usr   0.06 ( 1%) sys   0.76 ( 4%) wall   19252 kB ( 3%) ggc
 unaccounted todo      :   0.43 ( 3%) usr   0.09 ( 2%) sys   0.55 ( 3%) wall       0 kB ( 0%) ggc
 TOTAL                 :  14.26             4.11            18.52             742072 kB
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Fabio Dalla Libera
  • 1,297
  • 1
  • 10
  • 24
  • You're going need to provide much more information than that. It's hard for us to help without anything to look at. – Mysticial Jul 05 '12 at 05:11
  • 1
    You could try compiling different revisions and see when the increase in time occurs; then look at what changes occurred in that revision to hopefully give you some insight. – mparizeau Jul 05 '12 at 05:19
  • It might be worth taking a look at this thread http://stackoverflow.com/questions/2072454/how-do-i-find-out-why-g-takes-a-very-long-time-on-a-particular-file – daz-fuller Jul 05 '12 at 05:25
  • possible duplicate of [Debugging GCC Compile Times](http://stackoverflow.com/questions/6390532/debugging-gcc-compile-times) – devnull Mar 11 '14 at 04:54

3 Answers3

11

Steven Watanabe's template profiler may help you get a per class/function instantiation count.

See Debugging GCC Compile Times for an actual link to this tool.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Joel Falcou
  • 6,247
  • 1
  • 17
  • 34
7

When I read that you have made massive use of templates in your project, my first suspicion was template instantiation, and after seeing the following information, my suspicion became stronger:

parser       : ... (27%) wall 355964 kB (48%) ggc
name lookup  : ... (15%) wall 64919 kB ( 9%) ggc

Since I cannot see the code (as you have not posted it), so I can only suspect. My second suspicion is, you have not used explicit instantiation of the templates for the known types (which you will most certainly use), instead you depend on implicit instantiation and you are using templates from lots of.cpp file. If so, then that could be the major problem, because implicit instantiation causes same templates to be instantiated many times, once for each translation unit. So if you have M templates and you are using it from N translation units (.cpp), then there will be M * N instantiations, instead of just M instantiations.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Nawaz
  • 353,942
  • 115
  • 666
  • 851
  • 1
    Thank you very much. This is the case, same template classes instantiated in many .cpp files. I would try to find out the template classes that require the most time, and modify them. – Fabio Dalla Libera Jul 05 '12 at 06:01
1

AFAIK, no such compilation switch does exist.

A more manual method can be to split between preprocessing and compilation (gcc -E, then gcc -c on preprocessed file) to guess where time is spent.

Another solution is to instrument your build environment to have compilation time per file. Note that I can only recommend to set up continuous integration to track such evolutions early (as soon as it pops up, you detect it without having to dig in the past what introduced the jump).

As a rule of thumb, you can check that only relevant headers are included (try to remove some) or could switch to precompiled headers.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Bruce
  • 7,094
  • 1
  • 25
  • 42
  • Thank you very much. I already have an idea of the files that give the longest compilation times.And, as you point out, those are the ones which include most of the headers. I tried with precompiled headers, but got not so higher speed improvement (still I do not get why). only ccache gave me a speed improvement – Fabio Dalla Libera Jul 05 '12 at 05:22
  • with -Q I get that most of the time is spent on the parser. But I have no idea of what this could mean in term of code to be suspicious of. – Fabio Dalla Libera Jul 05 '12 at 05:26
  • The option `-ftime-report` exists on Mac OS X with `/usr/bin/gcc` -- `i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)`. The fuller report seems to come from `/usr/bin/clang` -- `Apple clang version 2.1 (tags/Apple/clang-163.7.1) (based on LLVM 3.0svn)`. It may be an option to `clang`, or it may just be an option in more recent versions of GCC. – Jonathan Leffler Jul 05 '12 at 05:50
  • thank you, -ftime-report exists in g++, but it gives just what you see in the post – Fabio Dalla Libera Jul 05 '12 at 06:02