76

I know PC-Lint can tell you about headers which are included but not used. Are there any other tools that can do this, preferably on Linux?

We have a large codebase that over the last 15 years has seen plenty of functionality move around, but the leftover #include directives rarely get removed when functionality moves from one implementation file to another, leaving us with a pretty good mess by this point. I can obviously do the painstaking thing of removing all the #include directives and letting the compiler tell me which ones to re-include, but I'd rather solve the problem in reverse - finding the unused ones - than rebuild a list of used ones.

Nick Bastin
  • It is notoriously hard to find something that isn't there. –  Aug 19 '09 at 18:40
  • This is a problem I've hit before, and not yet found a 100% reliable automated solution - I'm interested to see what answers we get. – DaveR Aug 19 '09 at 18:47
  • @Neil: In general that's true, but in this specific case it's not that hard (in the abstract). You "merely" identify all the symbols in the file, match them against the headers that satisfy them, and then prune out the headers that weren't used in that process. Of course, in reality it's complicated because you need a C/C++ parser and the definition of "required" is looser than you would want to make this process "easy". – Nick Bastin Aug 19 '09 at 19:03
  • @Nick and then you have headers which are used only on one platform or when compiling in some configuration, you have headers which provide all their symbols by including private headers which client code shouldn't include directly, you have headers which include another to be self-sufficient but you don't use the interface for which that other include is needed, ... – AProgrammer Aug 19 '09 at 19:09
  • @AProgrammer: Only being used on one platform is relatively easy to resolve - an analysis tool is going to preprocess them right out anyhow (which should also happen in your "some configuration" case). I'm not looking for headers that are listed in the file but properly preprocessed out - I'm looking for headers which include completely unnecessary source in the finished object code. Also, as for private headers, that's fine - they'll still be "used" in most cases (or they were unnecessary - a useful thing to know). – Nick Bastin Aug 19 '09 at 19:34
  • possible duplicate of [How should I detect unnecessary #include files in a large C++ project?](http://stackoverflow.com/questions/74326/how-should-i-detect-unnecessary-include-files-in-a-large-c-project) – Troubadour May 05 '12 at 22:14
  • possible duplicate of [C/C++: Detecting superfluous #includes?](http://stackoverflow.com/questions/614794/c-c-detecting-superfluous-includes) – Josh Kelley Jul 24 '13 at 17:47

9 Answers

31

DISCLAIMER: My day job is working for a company that develops static analysis tools.

I would be surprised if most (if not all) static analysis tools did not have some form of header usage check. You could use the Wikipedia page listing static analysis tools to get a list of available tools and then email the companies to ask them.

Some points you might consider when you're evaluating a tool:

For function overloads, you want all headers containing overloads to be visible, not just the header that contains the function that was selected by overload resolution:

// f1.h
void foo (char);

// f2.h
void foo (int);


// bar.cc
#include "f1.h"
#include "f2.h"

int main ()
{
  foo (0);  // Calls 'foo(int)' but all functions were in overload set
}

If you take the brute force approach (first remove all headers and then re-add them until the code compiles), and 'f1.h' happens to be added back first, then the code will compile but the semantics of the program have been changed.
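
For illustration, here is a sketch (mine, not from the original answer) of what the pruned file would look like if only 'f1.h' survived that pass:

// bar.cc - pruned version, for illustration only
#include "f1.h"   // 'f2.h' was dropped because the file still compiles without it

int main ()
{
  foo (0);  // Still compiles: 0 converts to char, so 'foo(char)' is now called
}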

A similar rule applies when you have partial or explicit specializations. It doesn't matter whether the specialization is selected or not; you need to make sure that all specializations are visible:

// f1.h
template <typename T>
void foo (T);

// f2.h
template <>
void foo (int);

// bar.cc
#include "f1.h"
#include "f2.h"


int main ()
{
  foo (0);  // Calls specialization 'foo<int>(int)'
}

As with the overload example, the brute force approach may result in a program which still compiles but has different behaviour.
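
Again, a sketch (mine) of the pruned translation unit: with 'f2.h' gone, the declaration of the explicit specialization is no longer visible, so the call falls back to the primary template.

// bar.cc - pruned version, for illustration only
#include "f1.h"   // the explicit specialization declared in 'f2.h' is no longer visible

int main ()
{
  foo (0);  // Compiles, but is no longer guaranteed to use the 'foo<int>' specialization
}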

Another related type of analysis that you can look out for is checking if types can be forward declared. Consider the following:

// A.h
class A { };

// foo.h
#include "A.h"
void foo (A const &);

// bar.cc
#include "foo.h"

void bar (A const & a)
{
  foo (a);
}

In the above example, the definition of 'A' is not required, so the header file 'foo.h' can be changed to contain only a forward declaration of 'A':

// foo.h
class A;
void foo (A const &);

This kind of check also reduces header dependencies.

Richard Corden
  • Most that I have looked at do not have a header usage check of this nature. You make a very good point about overloads and specializations, but thankfully our conventions are such that these would basically never be in different headers. – Nick Bastin Aug 20 '09 at 21:03
  • Also, I've been down the road with that wikipedia page. The C/C++ section is very weak...I suppose I should go down the list of commercial providers and see which ones support C++. Also, I'm perfectly fine with people suggesting their own product - it's more than I had to go on before, and your advice in general is very informative. – Nick Bastin Aug 20 '09 at 21:06
  • "For function overloads, you want all headers containing overloads to be visible, not just the header that contains the function that was selected by overload resolution: ..." +1, that's potential nightmare to debug and a big reason I'm afraid of doing this myself – Andres Salas Apr 03 '19 at 23:52
23

Here's a script that does it:

#!/bin/bash
# Prune #include directives one at a time, recompile, and put them back if the build fails.
# Arguments: list of source files to check.

# Comment out the single #include of $header in $file, marking it so it can be restored.
removeinclude() {
    file=$1
    header=$2
    perl -i -p -e 's+([ \t]*#include[ \t][ \t]*[\"\<]'"$header"'[\"\>])+//REMOVEINCLUDE $1+' "$file"
}

# Restore any include that was commented out by removeinclude().
replaceinclude() {
    file=$1
    perl -i -p -e 's+//REMOVEINCLUDE ++' "$file"
}

for file in "$@"
do
    # Collect the header names referenced by this file's #include directives.
    includes=`grep "^[ \t]*#include" "$file" | awk '{print $2;}' | sed 's/[\"\<\>]//g'`
    echo $includes
    for i in $includes
    do
        touch "$file" # just to be sure it recompiles
        removeinclude "$file" "$i"
        if make -j10 >/dev/null 2>&1
        then
            # Build still succeeds: delete the commented-out include for good.
            grep -v REMOVEINCLUDE "$file" > tmp && mv tmp "$file"
            echo removed $i from $file
        else
            # Build broke: put the include back.
            replaceinclude "$file"
            echo $i was needed in $file
        fi
    done
done
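
A hypothetical invocation (the script name is mine; it assumes you run it from a directory where make builds the project):

./prune_includes.sh src/*.cpp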
Andy C
  • I've used the same method myself. If you're using GCC, be sure to compile with `-Werror=missing-prototypes`, otherwise you can remove headers declaring functions that are defined in the source file, which can cause problems later (you won't notice if the header gets out of sync). – ideasman42 Apr 03 '13 at 15:12
  • Nice! It probably will not scale well on a big project, but it's exactly what I needed for a small project, thanks! However, I think it should first be run on all .h files and then on all .cpp files... – Stefano Jun 02 '14 at 09:13
  • This is great. I wonder if this brute force solution could be extended to handle the cases in the comments of https://stackoverflow.com/a/7111685/148668 above. – Alec Jacobson Dec 15 '20 at 15:11
  • That does not work ... consider you have a header which consists of 2 #includes. Your code only needs info from one of them, so you could replace "funcs.h" with "func1.h" and remove the need for "func2.h". – David V. Corbin Aug 03 '23 at 16:08
5

Have a look at Dehydra.

From the website:

Dehydra is a lightweight, scriptable, general purpose static analysis tool capable of application-specific analyses of C++ code. In the simplest sense, Dehydra can be thought of as a semantic grep tool.

It should be possible to come up with a script that checks for unused #include files.

Ton van den Heuvel
5

Google's cppclean seems to do a decent job of finding unused header files. I just started using it. It produces a few false positives. It will often find unnecessary includes in header files, but it will not tell you that you need a forward declaration of the associated class and that the include needs to be moved to the associated source file.
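
For reference, a minimal invocation looks something like the following (treat the exact command line as an assumption; options vary between versions):

cppclean src/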

Chance
  • `cppclean` cleans too much. If I have a header file `foo.h` that explicitly uses functionality/types defined in `bar.h` and in `baz.h` I would expect to see `foo.h` have a `#include "bar.h"` and a `#include "baz.h"`. Suppose that `bar.h` also happens to `#include "baz.h"`. This does not mean that I can get rid of the `#include "baz.h"` in `foo.h`. Most unneeded header file checks will say that I should get rid of it. This is a false positive, which are almost as bad as false negatives. (And maybe worse; too many false positives and I'll stop using the tool. Lint is a good example.) – David Hammen Aug 21 '11 at 00:37
  • @David, I agree that it produces many false positives, but I feel that it is better than manually examining each file, and false positives are quickly spotted and remedied. Do you have something you use that works pretty well? – Chance Aug 21 '11 at 20:01
  • Today my file may be able to get by without `#include "baz.h"`. Tomorrow, maybe not. Suppose `bar.h` is your responsibility and you too are in the process of removing unneeded headers. Your `bar.h` doesn't need `baz.h`, so you delete the extraneous `#include "baz.h"` from `bar.h`. You have just broken any code that piggybacked on that extraneous `#include`. The solution is not to rely on such piggybacks. If your file uses some functionality defined elsewhere, `#include` the file that defines that functionality. Don't let some other header do that `#include` for you. – David Hammen Aug 21 '11 at 20:29
  • @DavidHammen, if I don't want to rely on include piggybacks, then isn't a tool like cppclean the way to do it? Yes, if I remove some header that's unnecessary, it'll break a lot of stuff, but really, those files shouldn't have included the piggyback in the first place. So, I go and fix those files that relied on that include. If I truly want better dependency management, that's what I should be doing anyway. I'm confused, because you say piggybacks are bad, but then you say not to use a tool that forces me to eliminate them. – Chance Oct 23 '12 at 17:46
  • I don't think you understand the problem. Automated tools sometimes miss the obvious, sometimes go too far. They give false positives and false negatives. You can use such tools, but always take them with a grain of salt. – David Hammen Oct 23 '12 at 18:22
  • This project seems to have disappeared (no source here: http://code.google.com/p/cppclean/source/checkout) and that seems to be a known issue: http://code.google.com/p/cppclean/issues/detail?id=3#makechanges – David Doria Nov 13 '12 at 21:49
  • As David said, it's a known issue but still available via svn: `svn checkout http://cppclean.googlecode.com/svn/trunk/ cppclean-read-only` – math Mar 27 '13 at 16:39
  • As of today, the SVN does not respond anymore either. – antipattern Aug 08 '17 at 10:02
3

If you are using Eclipse CDT you can try Includator which is free for beta testers (at the time of this writing) and automatically removes superfluous #includes or adds missing ones.

Disclaimer: I work for the company that develops Includator and have been using it for the past few months. It works quite well for me, so give it a try :-)

Mirko Stocker
1

As far as I know, there isn't one (other than PC-Lint), which is a shame, and surprising. I've seen the suggestion to do this bit of pseudocode (which is basically automating your "painstaking process"):

for every cpp file
    for every header include
        comment out the include
        compile the cpp file
        if( compile_errors )
            un-comment out the header
        else
            remove header include from cpp

Put that in a nightly cron job and it should do the job, keeping the project in question free of unused headers (you can always run it manually, obviously, but it'll take a long time to execute). The only problem is when leaving out a header doesn't generate an error but still changes the generated code.
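
A hypothetical crontab entry for such a nightly run (the script name and paths are made up for the example):

# run the include pruner every night at 02:00 and keep a log of what was removed
0 2 * * * cd /home/build/project && ./prune_includes.sh src/*.cpp >> prune_includes.log 2>&1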

Cinder6
  • That still unfortunately doesn't clean up headers that include other headers that aren't required (and worse, may cause some "programming by coincidence" in other implementation files that get the headers they need through other headers that actually don't need them). It at least reduces the number of spurious includes in cpp files, but I would like to eliminate them in other headers as well. – Nick Bastin Aug 19 '09 at 19:06
  • It's also inadvisable to remove every header. Consider `#include <vector>` and `#include <algorithm>`. In some implementations of `<vector>`, `<algorithm>` will be included, but that isn't guaranteed. Robust code should include both (if they're both used). Your described method could remove `#include <algorithm>` depending on the implementation of `<vector>`. – deft_code Aug 20 '09 at 01:26
  • This is true. Nick, are you more concerned with local header files (or do you at least have a lot of them)? If so, you could modify the above algorithm to not mess with library headers, and tune those manually. It's a pain, but it would cut the work down, at least. – Cinder6 Aug 20 '09 at 01:44
1

I've done this manually, and it's worth it in the short term (or is it the long term? It takes a long time) due to reduced compile time:

  1. Fewer headers to parse for each cpp file.
  2. Fewer dependencies - the whole world doesn't need re-compiling after a change to one header.

It's also a recursive process - each header file that stays in needs examining to see whether any header files it includes can be removed. Plus, sometimes you can substitute forward declarations for header includes.

Then the whole process needs repeating every few months/year to keep on top of leftover headers.

Actually, I'm a bit annoyed with C++ compilers; they should be able to tell you what's not needed - the Microsoft compiler can tell you when a change to a header file can be safely ignored during compilation.

quamrana
0

If someone is interested, I just put a small Java command-line tool on SourceForge for doing exactly that. As it is written in Java, it obviously runs on Linux.

The link for the project is https://sourceforge.net/projects/chksem/files/chksem-1.0/

-1

Most approaches for removing unused includes work better if you first make sure that each of your header files compiles on its own. I did this relatively quickly as follows (apologies for typos -- I am typing this at home):

find . -name '*.h' -exec makeIncluder.sh {} \;

where makeIncluder.sh contains:

#!/bin/sh
# Create a .cpp wrapper that does nothing but include the given header.
echo "#include \"$1\"" > "$1.cpp"

For each file ./subdir/classname.h, this approach creates a file called ./subdir/classname.h.cpp containing the line

#include "./subdir/classname.h"

If your makefile in the . directory compiles all cpp files and contains -I., then just recompiling will test that every include file can compile on its own. Compile in your favorite IDE with goto-error, and fix the errors.
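
If you would rather not touch your makefile, a rough one-off equivalent of that check (my own sketch; it assumes g++ and headers resolvable relative to the current directory) is:

# try to compile every generated wrapper on its own, discarding the object files
find . -name '*.h.cpp' -exec g++ -I. -c {} -o /dev/null \;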

When you're done, find . -name '*.h.cpp' -exec rm {} \;

  • I fail to see how this helps. Even if some of my headers didn't compile on their own (which isn't the case anyhow), this wouldn't provide any additional insight into the unused ones. – Nick Bastin Sep 19 '13 at 08:55