
I would like to remove unused headers from a large codebase. I know there are open-source tools for this, but the number of false positives is too high. My idea is to run a script over each file that goes through the includes one at a time: remove an include and try to compile. If it compiles, move on to the next line with that include removed; otherwise restore the include and move on to the next line.

Are there any problems with this idea in the long term? Or is there an easier option to try?

πάντα ῥεῖ
Newbie
  • If the intent is to speed up build times, why not use precompiled headers? – Govind Parmar Feb 21 '19 at 17:36
  • The intent is to clean up a huge mess of includes. – Newbie Feb 21 '19 at 17:37
  • Have you tried [Clang include fixer](https://clang.llvm.org/extra/include-fixer.html) or [Include What You Use](https://github.com/include-what-you-use/include-what-you-use)? – tkausl Feb 21 '19 at 17:37
  • Yes, a lot of false positives. – Newbie Feb 21 '19 at 17:37
  • In some cases, removing an include might still compile but produce different code, or even make the program ill-formed, no diagnostic required (NDR)... :-/ – Jarod42 Feb 21 '19 at 17:37
  • @Jarod42 That is why I am asking, could you say something more? – Newbie Feb 21 '19 at 17:38
  • I wouldn't use this method. You could have something like `#include <iostream> #include <string> int main() { std::cout << std::string{"test"}; }` and your script could remove `#include <string>` but on some platforms the code will still compile. The code is no longer correct, but since it compiles you'll continue on. – NathanOliver Feb 21 '19 at 17:40
  • @NathanOliver Sorry for the possibly stupid question, but how could it work on a different platform? – Newbie Feb 21 '19 at 17:42
  • @Newbie Because of a different compiler implementation (e.g. g++ vs msvc++). – πάντα ῥεῖ Feb 21 '19 at 17:42
  • In some implementations `<iostream>` includes `<string>`, which is not required. – Jarod42 Feb 21 '19 at 17:43
  • I know which compiler is used in prod, and it is the same one I will use for compilation. – Newbie Feb 21 '19 at 17:44
  • They might change the implementation with a version upgrade, though. – Jarod42 Feb 21 '19 at 17:45
  • @Newbie And if your compiler is one of the ones that does something like that you'll remove headers that you shouldn't. Just because it compiles on your platform doesn't mean you should do it though. If you ever upgrade and they change how the headers are structured then you'll have to do the opposite process which won't be fun. – NathanOliver Feb 21 '19 at 17:46
  • Related to [tools-to-find-included-headers-which-are-unused](https://stackoverflow.com/questions/1301850/tools-to-find-included-headers-which-are-unused) – Jarod42 Feb 21 '19 at 17:48
  • So one of the requirements is to compile with multiple compilers? – Newbie Feb 21 '19 at 17:48
  • That is something you should always strive for. Portable code is often correct code. It also makes upgrading/changing the compiler a lot easier. – NathanOliver Feb 21 '19 at 17:49
  • @Newbie _"So one of the requirements is to compile with multiple compilers?"_ If you want to keep your code portable, standard C++ code, yes. But if you don't care about the standard header dependencies and only care about your own header dependencies, a GNU make build system could help to generate that information for you. – πάντα ῥεῖ Feb 21 '19 at 17:50
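
To make Jarod42's "still compiles but produces different code" point concrete, here is a minimal sketch. It is not taken from the thread: the `describe` overloads and the `DROP_WIDGET_HEADER` macro are hypothetical stand-ins, with the macro simulating what the script would do by deleting an include. It only illustrates the "different code" case, not the ill-formed NDR one.

```cpp
#include <cassert>
#include <string>

// Stand-in for a hypothetical "widget.h" that declares an exact-match overload.
// Defining DROP_WIDGET_HEADER simulates the script removing that include.
#ifndef DROP_WIDGET_HEADER
inline std::string describe(int) { return "int overload"; }
#endif

// Stand-in for a second, hypothetical header that stays included.
inline std::string describe(double) { return "double overload"; }

// Compiles in both configurations. With the int overload visible, the call
// binds to it; without it, 42 is converted to double and the call binds to
// the other overload, with no error from the compiler.
inline std::string describe_answer() { return describe(42); }
```

Built as is, `describe_answer()` picks the `int` overload. Built with `-DDROP_WIDGET_HEADER`, it still compiles but picks the `double` overload, so a "does it still compile?" check alone would not notice the change.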

1 Answer


> Are there any problems connected with this idea?

Yes. A file can compile successfully even when an include it needs is missing, for example because another header happens to pull it in. So this approach has false positives of its own and may remove headers that are actually used.

It is quite difficult, both manually and automatically, to analyse which headers should be included and which are unnecessary. Tools have been made to do the checking automatically. Even if the "number of false positives is too big", it is still (in my experience) a small fraction of all included headers, so checking the results of such a tool is far less work than comparing the includes of every file against the entire content of those files. Even the script that you suggest can be better than nothing, as long as you don't remove the includes without manual checking.

It helps manual checking to make the files as small as possible. As a side effect, this also makes incremental compilation much faster (but compilation from scratch slower).

eerorika
  • My first thought was if one could do something clever with the clang frontend, but does the frontend support hooking in the preprocessing phase? – πάντα ῥεῖ Feb 21 '19 at 18:15
  • @πάνταῥεῖ I'm not very familiar with clang frontend. I'm assuming that (and something clever) is what the existing tools use. – eerorika Feb 21 '19 at 18:19
  • Sure, if you know where the headers are included, and at the same time a symbol table of the translation units and what's actually declared in the header files as an AST, you're almost done with the job ;-) ... – πάντα ῥεῖ Feb 21 '19 at 18:22
  • Even if you can hook the preprocessing phase, the compiler can't distinguish between two headers that (through whatever mechanism) declare the same name. The issue is which header is **required** to declare the name, not which ones incidentally declare it. That comes from the language definition, not from code analysis. – Pete Becker Feb 21 '19 at 18:22
  • @PeteBecker: Modules should help in that regard :-) – Jarod42 Feb 21 '19 at 19:16