
I am working on a C++ codebase of 500k+ lines of code that uses CMake. One problem I have noticed is that the (incremental) build is really slow.

This is partly because the dependency graph is not fine-grained enough. Many targets pull in unnecessary dependencies through target_link_libraries: a target links against (and so depends on) a whole library when it needs only a small part of it, and the rest of that library is functions and classes the target never uses. So when a small change is made to one of those library source files, a lot of targets that do not need the change get rebuilt anyway.

I thought about refactoring all the CMakeLists.txt files so that each add_library covers a single file and each target_link_libraries links only the libraries that are actually needed. But refactoring this way is too time-consuming, not automatic enough, and error-prone.

Can Bazel, SCons, or another build system solve this problem?

Related question: How to speed up Compile Time of my CMake enabled C++ Project?

cpchung
  • "Can Bazel or Scons solve this problem?" - Maybe, or maybe not. It depends on **many factors**, and "500k+ lines of code" is just one of them. Without trying it on a specific project, one cannot be sure which build system is the most effective for it. – Tsyvarev Apr 10 '22 at 09:12
  • So if I understand correctly, you want whichever build system you choose to know all the actual symbol dependencies and what provides them and then build the link sources automatically from this? (assuming you want the system to automatically change when you change which symbols are needed by your programs) Is that what you're asking for? – bdbaddog Apr 11 '22 at 02:37
  • @bdbaddog yes. Ideally I would not need to specify the needed dependencies manually, because users could add unnecessary dependencies, or existing dependencies could become obsolete as the code changes. Maybe something similar to Maven from the Java world. – cpchung Apr 11 '22 at 16:53
  • Doing that would be VERY computationally and file I/O expensive. Another option would be to just link against the objects needed and skip the libraries altogether, though the linker would still pull in every object listed. Yet another option is to build a static library, but then you'll have an issue with the ordering of such libraries and their interdependencies. So to summarize, I don't think any current build system for C/C++ will do this out of the box. You might write logic to automate the search for actual dependencies by dropping libraries until the link fails, which is also compute and I/O costly. – bdbaddog Apr 11 '22 at 17:25
  • Maybe this would point you in the right direction? https://stackoverflow.com/questions/8025766/makefile-auto-dependency-generation – bdbaddog Apr 11 '22 at 17:26
  • MongoDB has been discussing this concept, and we have talked about using header files to determine dependencies. If a cpp source file includes a header file, we can in the majority of cases determine which library is needed; there may be a few manual exceptions to maintain. We discussed pairing this with IWYU (symbol resolution) so that only the dependencies actually needed are calculated dynamically via a SCons scanner. Note that the IWYU tool would run as a pre-commit hook for each changeset, so the symbol resolution is maintained automatically and the dynamic cost is minimal. – dmoody256 Apr 11 '22 at 20:12
  • @dmoody256 it looks like IWYU is still at an early stage. It can remove extra #include lines from each file, but it is not smart enough to prune the CMake file. – cpchung Apr 12 '22 at 02:30
  • @cpchung IWYU doesn't need to touch build scripts. SCons already parses the source files for #include lines and can then use that info to find the related cpp files (the mongodb codebase gives cpp and header files matching basenames), and from those SCons can find the library they are compiled into. Therefore, from the includes we can get a true dependency list tied to the actual symbols in the source. IWYU makes sure there are no unnecessary includes and, by extension, no unnecessary dependencies. – dmoody256 Apr 12 '22 at 04:52

1 Answer


Here are some things I used while moving a 6M LOC codebase to CMake:

  1. Use LINK_DEPENDS_NO_SHARED (doc). You don't need to relink every dependent target if you only changed .cpp files.
  2. Use PUBLIC, PRIVATE and INTERFACE keywords correctly.
  3. Hide your symbols to improve link times:
       set(CMAKE_CXX_VISIBILITY_PRESET hidden)
       set(CMAKE_VISIBILITY_INLINES_HIDDEN ON)
  4. Build only the subset of targets you need. This applies if you have some kind of plugin system and don't need all of the plugins to be available at once.
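
To make the first two items concrete, here is a minimal sketch rather than a drop-in configuration. The target names (core, json_utils, headers_only, app) and all file paths are hypothetical, and it assumes json_utils uses core only inside its .cpp files, never in its public headers.

```cmake
cmake_minimum_required(VERSION 3.16)
project(link_hygiene_sketch LANGUAGES CXX)  # hypothetical project name

# LINK_DEPENDS_NO_SHARED: this variable initializes the property of the same
# name on every target created after this line, so a target that links to a
# shared library is not relinked just because that library file was rebuilt.
set(CMAKE_LINK_DEPENDS_NO_SHARED ON)

# Hypothetical low-level library; its headers are part of its public interface.
add_library(core SHARED src/core.cpp)
target_include_directories(core PUBLIC include)

# Assumption: json_utils needs core only in its implementation files.
# PRIVATE keeps core out of json_utils' link interface and usage requirements,
# so consumers of json_utils do not inherit a dependency on core.
add_library(json_utils SHARED src/json_utils.cpp)
target_link_libraries(json_utils PRIVATE core)

# A header-only dependency has nothing to compile, so it is an INTERFACE
# library: it only carries usage requirements for whoever links to it.
add_library(headers_only INTERFACE)
target_include_directories(headers_only INTERFACE third_party/include)

# The executable names exactly what it uses directly, nothing more.
add_executable(app src/main.cpp)
target_link_libraries(app PRIVATE json_utils headers_only)
```

If json_utils did expose core types in its own headers, the link would have to be PUBLIC instead, and app would then legitimately depend on core as well.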

Remember that if you change header files you MUST rebuild every dependent target; otherwise you risk ABI incompatibility. The PIMPL idiom might help you there.

Osyotr