11

I'm working on a very large-scale project where the compilation time is very long. What tools (preferably open source) can I use on Linux to find the most heavily included files and optimize their usage? Just to be clearer, I need a tool which will, given the dependencies, show me which headers are the most included. By the way, we do use distributed compiling.

Kara
user12371
  • Just to be clearer, I need a tool which will, given the dependencies, show me which headers are the most included. By the way, we do use distributed compiling – user12371 Sep 17 '08 at 07:59
  • Perhaps you should edit your question to include this information rather than have it as a comment? – Dominik Grabiec Sep 17 '08 at 08:18

11 Answers

4

Check out makedepend

INS
  • This gives me the dependencies for each file. I need something that, given this, will find the most included files. – user12371 Sep 17 '08 at 08:04
4

The answers here will give you tools which track #include dependencies. But there's no mention of optimization and such.

Aside: The book "Large Scale C++ Software Design" should help.

Community
Agnel Kurian
3

Using the Unix philosophy of "gluing together many small tools" I'd suggest writing a short script that calls gcc with the -M (or -MM) and -MF (OUTFILE) options (As detailed here). That will generate the dependency lists for the make tool, which you can then parse easily (relative to parsing the source files directly) and extract out the required information.
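A minimal sketch of such a script, assuming the dependency rules are in the make format that gcc -MM prints. The function name rank_headers and the list of header extensions are illustrative assumptions, not anything gcc provides:

```shell
#!/usr/bin/env bash
# Sketch: rank headers by how many translation units depend on them.
# Reads make-style rules (the format `gcc -MM` prints) on stdin.
# rank_headers is a hypothetical helper name, not part of gcc or make.
rank_headers() {
    sed -e 's/\\$//' -e 's/^[^ :]*://' |  # drop line continuations and the "target:" prefix
        tr -s ' \t' '\n\n' |              # one path per line
        grep -E '\.(h|hh|hpp|hxx)$' |     # keep header files only (assumed extensions)
        sort | uniq -c |                  # count occurrences of each header
        sort -k1,1nr -k2                  # most-included first, ties by name
}

# Example (assumed layout; add the -I/-D flags your build really uses):
#   find . -name '*.cpp' -exec gcc -MM {} + 2>/dev/null | rank_headers | head -20
```

Because gcc lists each header once per rule, the count is the number of source files that pull the header in, directly or indirectly. Without the project's real include paths and defines the preprocessor may fail or take different branches, so the flags in the example would need adjusting.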

Dominik Grabiec
  • OMG! Where have I been for decades of hacking? Right from the authoritative source. A small amount of quality time with these options (as detailed in Daemin's solid reply) and some routine yet pertinent hackery, and there it is. Thanks Daemin. – davernator Nov 29 '19 at 23:46
2

Tools like doxygen (used with the graphviz options) can generate dependency graphs for include files... I don't know if they'd provide enough overview for what you're trying to do, but it could be worth trying.

slicedlime
2

From the root level of the source tree, do the following (\t is the tab character):

find . -type f -exec grep '^[ \t]*#include[ \t][ \t]*["<][^">]*[">]' {} ';' |
    sed 's/^[ \t]*#include[ \t][ \t]*["<]//' |
    sed 's/[">].*$//' |
    sort |
    uniq -c |
    sort -r -k1 -n

Line 1 gets all the include lines. Line 2 strips off everything before the actual filename. Line 3 strips off the end of the line, leaving only the filename. Lines 4 and 5 count each unique line. Line 6 sorts by count in descending order.

paxdiablo
1

If you wish to know which files are included most of all, use this bash command:

find . -name '*.cpp' -exec grep -E '^[[:space:]]*#include[[:space:]]+["<][[:alpha:][:digit:]_.]+[">]' {} \; |
sort | uniq -c | sort -k 1rn,1 |
head -20

It displays the top 20 files, ranked by the number of times they were included.

Explanation: The 1st line finds all *.cpp files and extracts the lines containing an #include directive from them. The 2nd line counts how many times each file was included, and the 3rd line keeps the 20 most-included files.

1

Use ccache. It hashes the inputs to each compilation and caches the results, which can drastically speed up rebuilds where most inputs are unchanged.

If you wanted to detect the multiple includes, so that you could remove them, you could use makedepend as Iulian Șerbănoiu suggests:

makedepend -m *.c  -f - > /dev/null

will give a warning for each multiple include.

Joe Hildebrand
1

The bash scripts on this page aren't a good solution; they only work on simple projects. In a large project like the one described in the question, preprocessor conditionals (#if, #else, ...) are often used. Only more sophisticated tools, such as makedepend or scons, can give accurate information. gcc -E can help, but on a large project analyzing its output by hand is a waste of time.
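The gcc -E output does not have to be read by hand, though. In preprocessed output, gcc marks the start of every file it enters with a linemarker of the form # 1 "file" 1, so a small filter can list the headers that were really included after all #if/#else branches were resolved. This is only a sketch; entered_headers is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Sketch: list the headers the preprocessor actually entered.
# Reads `gcc -E` output on stdin. A linemarker whose first flag is 1
# means "entering a new file"; pseudo-files such as <built-in> are skipped.
# entered_headers is a hypothetical helper name, not a gcc feature.
entered_headers() {
    awk '/^# [0-9]+ "[^"<][^"]*" 1( |$)/ {
             split($0, parts, "\"")   # parts[2] is the file name between the quotes
             print parts[2]
         }'
}

# Example (assumed file name and flags):
#   gcc -E -Iinclude src/main.cpp | entered_headers | sort | uniq -c | sort -rn
```

Running this per source file and concatenating the results before the sort | uniq -c step gives a project-wide ranking that respects the preprocessor conditionals, at the cost of preprocessing every file once.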

Johan Moreau
0

IIRC, gcc can create dependency files.

EricSchaefer
0

You might want to look at distributed compiling; see, for example, distcc.

Toni Ruža
0

This is not exactly what you are searching for, and it might not be easy to set up, but maybe you could have a look at lxr: lxr.linux.no is a browsable kernel tree.

In the search box, if you enter a filename, it will give you where it is included. But this is still guessing, and it does not track chained dependencies.

Maybe

strace -f -e trace=open,openat -o outfile make
grep 'some handy regex to match header' outfile
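As a rough sketch of what that grep step could look like, the function below counts successful opens of header files in an strace log. It assumes the log was written with -o (and ideally -f, so the compiler processes started by make are traced too); opened_headers and the extension list are illustrative assumptions:

```shell
#!/usr/bin/env bash
# Sketch: count successful opens of header files recorded in an strace log.
# $1 is the log written by `strace -o`; with -f each line starts with a PID.
# opened_headers is a hypothetical helper name, not an strace feature.
opened_headers() {
    grep -E '^([0-9]+ +)?open(at)?\(' "$1" |       # open()/openat() calls only
        grep -v ' = -1 ' |                         # drop failed lookups (ENOENT etc.)
        grep -oE '"[^"]+\.(h|hh|hpp|hxx)"' |       # quoted header paths (assumed extensions)
        tr -d '"' |
        sort | uniq -c | sort -k1,1nr -k2          # most-opened first
}

# Example: opened_headers outfile | head -20
```

Unlike the dependency-file approaches, this counts what the compiler really opened on disk, so headers that are searched for in several include directories only count where they were found.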
shodanex