
We have been working on our project with SCons as the build system for a few years. SCons is awesome and has improved our development process a lot. The project initially contained ~15,000 C++ source files, and as it evolved, source files in other languages (Java/Scala) were added to the code base as well. We then developed our own builders, which manage dependencies and invoke the external tools (javac, scalac) to build those sources; this works really well. Recently I was working on optimizing the current build system and found a performance difference between the C++ and Java build processes:

Environment setup: Build server with 24 cores, Python 2.7.3, Scons 2.2.0
Command: scons --duplicate=soft-copy -j8

When building Java code, CPU usage is high (easily observed in top) and spans multiple cores: (screenshot: Java Build)

However, when building C++ code, CPU usage stays at ~4% and only one core is used, no matter how many jobs I pass to SCons: (screenshot: C++ Build)

I've been googling a lot but could not find anything useful. Am I hitting Python's GIL? I believe each gcc/g++ command is run in a separate subprocess, just like javac in our own builders, so the GIL should not be an issue here. Is there any workaround to fully utilize multiple cores and speed up the C++ build further? Thanks in advance.
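My understanding of the subprocess model is roughly the following sketch (hypothetical: the file names are made up, and the real `g++ -c src -o obj` call is replaced by a trivial Python subprocess so it runs anywhere). Since each command runs in its own OS process, the GIL only serializes the Python-side bookkeeping, not the compilations themselves:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def compile_one(src):
    # Stand-in for something like ["g++", "-c", src, "-o", obj]; each
    # command runs in its own OS process, and the GIL is released while
    # the calling thread waits on it.
    return subprocess.call([sys.executable, "-c", "pass"])

sources = ["a.cpp", "b.cpp", "c.cpp"]  # hypothetical file names
with ThreadPoolExecutor(max_workers=8) as pool:
    exit_codes = list(pool.map(compile_one, sources))

print(exit_codes)  # [0, 0, 0] when every subprocess exits cleanly
```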

muneebShabbir
WindLeeWT
  • You might need to set the `-j` flag: http://stackoverflow.com/questions/414714/compiling-with-g-using-multiple-cores – doctorlove Nov 08 '16 at 12:37
  • Is it possible that your implicit dependencies for the C++ part (headers and lib) simply don't allow the build steps to be parallelized further? SCons itself shouldn't have any problem with larger builds or parallel processing (see e.g. [Why SCons Is Not Slow](https://bitbucket.org/scons/scons/wiki/WhySconsIsNotSlow)) and its CPU measures show a full 100% in your screenshot. SCons might be idly spinning, trying to find new tasks to spawn, but the current dependencies forbid this. Other large projects (like MongoDB) don't seem to have a problem in this area.... ;) – dirkbaechle Nov 08 '16 at 13:29
  • @dirkbaechle Thanks for your reply, that's a reasonable guess that the width of the DAG is at most N (N < 8). But I'm still wondering: there are hundreds of targets to be built, and among them plenty of programs that are definitely end points of the dependency graph. However, CPU usage is still ~100% **only on 1 core**. By contrast, the Java build runs heavily on several cores. – WindLeeWT Nov 08 '16 at 16:00
  • @dirkbaechle We wrote our own tool to generate a single SConstruct (~50MB) in the root of the workspace. Is that a possible cause? I increased the job number to 12 but the build time is still the same. When I build the Java/Scala code alone, a higher job number really does reduce the build time a bit. – WindLeeWT Nov 08 '16 at 16:08
  • When you say "build" are we talking about a "build from scratch" or an "incremental build with no changes" (null build)? – dirkbaechle Nov 08 '16 at 16:11
  • @WindLeeWT - Why would you generate a single huge SConstruct? It's really hard to judge without seeing the contents of your SConstruct and/or the output of --tree=prune. Any chance you can make a small(er) testcase to demonstrate the issue? As Dirk said, many projects have no problem maxing out the CPUs during the build phase, so your issue is very unusual. – bdbaddog Nov 08 '16 at 17:00
  • @dirkbaechle Today I moved onto another build server and launched a clean build, and found that several cores were running at ~90% with 'cc1plus -o ...', which should be the action of compiling C++ source code. Suddenly I remembered that the previous build server was configured with ccache for best performance; it seems that most of the CPU usage during a C++ build is taken by the compilation stage, which was avoided entirely due to ccache. That's why I didn't see the CPU usage in top. Thanks for the help. – WindLeeWT Nov 09 '16 at 12:14
  • @bdbaddog Thanks for your help. Please see the comment above. Btw, is there any negative effect of a single huge SConstruct for the whole source tree within a workspace (~10,000 C++ source files)? – WindLeeWT Nov 09 '16 at 12:16
  • @WindLeeWT I can't think of any particular negative effect. (Though the file would be pretty unreadable). Why do it? – bdbaddog Nov 09 '16 at 23:38
  • @bdbaddog We built a simple build system on top of SCons so that a developer is able to describe targets/dependencies more easily. A single SConstruct is then generated automatically with all targets/sources, starting from the root of the workspace, and deleted at the end of the build. – WindLeeWT Nov 10 '16 at 02:33
  • Just a thought.. have the scons logic read the source file and dynamically build the dependency tree.. It is python. – bdbaddog Nov 11 '16 at 22:31

1 Answer


As WindLeeWT explained in one of his comments, the cause of the observed behaviour was that ccache was installed and configured on the server in question. Most of the CPU usage during a C++ build is taken by the compilation stage, which was avoided entirely thanks to ccache. That's why no CPU usage across several cores could be seen in top.

As soon as they launched a "build from scratch" on another server without ccache, several cores were running at 90% with 'cc1plus -o ...' as expected.

No performance penalties (GIL etc.) were involved, and neither Python nor SCons degraded performance in any significant way.
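If you suspect a similar setup, a quick check (a hypothetical diagnostic sketch, not part of the original discussion) is to see whether ccache is on the PATH and inspect its hit/miss statistics; a high hit rate would explain near-zero compiler CPU usage during a "full" build:

```python
import shutil
import subprocess

# Look for ccache on the PATH; returns the full path or None.
ccache = shutil.which("ccache")
print("ccache found at:", ccache)
if ccache:
    # "ccache -s" prints cache statistics (hits, misses, cache size).
    subprocess.call([ccache, "-s"])
```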

dirkbaechle
  • Thanks Dirk. Sorry, I was a bit busy recently and forgot to update the question with our discussion and the cause of the problem. – WindLeeWT Dec 17 '16 at 09:35