
First of all, let me clarify my question is totally unrelated to Is there any benefit to passing all source files at once to a compiler?.

What I'd like to know here is whether my build times would be faster if I made:

  • a) many calls to the compiler, with a single source file each

  • b) few calls to the compiler, with many files each

  • c) a moderate number of calls to the compiler, with a bunch of files each

Why am I asking this? I'm in the middle of creating a ninja generator (similar to the bunch here) and I'd like to know the best way to create the dependency DAG.

You could claim that spawning fewer subprocesses will typically be faster and less expensive, but I'd like to know whether the gains would be worth it so I can design my tool properly.

Just to clarify a little bit further, here's a simple example: as you can see in that build.ninja file, there are many build statements with a 1:1 dependency per compiler call, so... could the build times be improved by grouping a few source files into single build statements?
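
To make the trade-off concrete, here's a minimal sketch of what the two layouts could look like in a build.ninja file; the rule names, flags and paths are hypothetical, assuming MSVC's cl as the compiler:

```
# (a) one build statement, and therefore one compiler call, per source file
rule cc_single
  command = cl /nologo /c $in /Fo$out

build obj\a.obj: cc_single src\a.cpp
build obj\b.obj: cc_single src\b.cpp

# (b)/(c) one build statement covering several sources, i.e. a single compiler call;
# /Fo with a trailing backslash makes cl write one .obj per input into that directory
rule cc_batch
  command = cl /nologo /c $in /Foobj\

build obj\a.obj obj\b.obj: cc_batch src\a.cpp src\b.cpp
```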

EDIT: Also, I guess Why is creating a new process more expensive on Windows than Linux? can provide some insights into the current topic.

  • What about c), a moderate number of calls to the compiler with a bunch of files each? You don't say anything about your compiler or build hardware. It probably depends on the number of cores, the amount of RAM, the kinds of disks and possibly network speed. – Bo Persson Nov 15 '17 at 14:30
  • @BoPersson Edited the question. The configuration I'll be using to test will mostly be the Visual Studio cl compiler (VS2015/VS2017). I understand CPU+RAM+disk+network are extremely important factors, but let's take those out of the equation, as my question intends to figure out the "best" way to produce optimal ninja files. Said otherwise, is there any major difference when picking a, b or c, or does it not matter at all? It'd be interesting to know which approach is used by large projects like Chromium; that would give me some insight into real-world use cases. – BPL Nov 15 '17 at 15:05
  • Maybe you can measure it. Record the actual compiler invocations from VS, and with some text editing you can modify them to simulate the behavior you mention. I've checked a project: 300 files with GCC under Linux. The difference between a) and "one call to the compiler with all the files" is that a) is 0.2% slower. – geza Nov 20 '17 at 21:14
  • @geza Interesting! Are these 300 files from a real-world project, or just some files created procedurally for the experiment? Did you try a cold build in both cases? Anyway, you're right, making a test with a real project and editing the compiler output to merge it into "one call to the compiler with all the files" sounds like a good idea. Btw, on Windows there is a limitation where you can only use 8 KB, or 32 KB (Unicode), of command-line arguments; I guess that limitation doesn't exist on Linux? I'd have to spawn the calls in batches, as I think the VS compiler doesn't read from stdin. – BPL Nov 20 '17 at 21:22
  • It's a real-world project. I tried it with warm disk caches, but I don't think there would be a significant difference (at least for my case); the source code is small. There is some limit on Linux too, but I think it's much higher than 8 KB. In my case the command line was ~15 KB and it had no problems. – geza Nov 20 '17 at 21:35

1 Answer


It depends. (But you knew that was going to be the answer to a "is it faster to ..." question, didn't you?)

Using MSVC it is much faster to compile all the files with a single call to the compiler. It's not just that process creation is slow on Windows; the compiler also takes quite a long time to initialize itself. (Of course, the speed-up going from twenty-at-a-time to all-at-once will probably be much less than going from one-at-a-time to twenty-at-a-time.)

Using gcc/clang on Linux, it is probably faster to compile one file at a time. This is not so much because gcc is faster that way, but because you can use ccache (ccache will only optimize if given a single file to compile), and ccache makes compilation much faster. (Obviously, if all your files are different every time - because they are generated with unique content - ccache won't help.) The same applies to mingw on Windows.
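
For illustration, here's a minimal sketch of a one-file-per-call ninja rule that puts ccache in front of the compiler; the compiler choice, flags and paths are assumptions, not something taken from the question:

```
rule cc
  # ccache fronts the real compiler; with a single file per call it can cache the result
  command = ccache g++ -MMD -MF $out.d -c $in -o $out
  depfile = $out.d
  deps = gcc

build obj/a.o: cc src/a.cpp
build obj/b.o: cc src/b.cpp
```

With this layout ninja can still run as many of these single-file calls in parallel as you allow with -j, so the per-call overhead is partly hidden on multi-core machines.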

  • Interesting! So it seems it's worth having my metabuild system generate ninja files depending on the OS/compiler. You've already clarified that on Windows/MSVC this is definitely a big win to consider, but as you know, on Windows CreateProcess's lpCmdLine can't be bigger than 32 KB, which means you could only batch a few files per compiler call. Any workaround for that (see the response-file sketch after these comments)? As far as I know MSVC can't read a file listing from stdin, right? Finally, do you know why big projects like Chromium (on Windows) use multiple cl calls, one per file... wouldn't they benefit from the other way around? – BPL Nov 24 '17 at 15:35
  • As for my question, even if I've asked the typical "is it faster to..." question, I didn't want it to be a very generic one, as basically I'm only interested in the major platforms/compilers. Your answer addresses those, so if nobody brings anything more elaborate to the table this will probably be marked as the valid/bountied one; +1 in the meantime ;). Also, about the [CreateProcess](https://msdn.microsoft.com/en-us/library/windows/desktop/ms682425(v=vs.85).aspx) 32 KB limitation... – BPL Nov 24 '17 at 15:37
  • One more reason for preferring separate compiles would be that ninja parallelizes independent build steps by default, so even if you spend more CPU time, wall-clock time might be lower with a more fine-grained split on machines with many cores. Also, more fine-grained builds should allow for more fine-grained dependencies, potentially lessening the number of recompilations necessary. – PaulR Nov 24 '17 at 16:49
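
Regarding the 32 KB command-line limit raised in the comments above: one possible workaround (a sketch under my own assumptions, not something stated in the answer) is to combine ninja's rspfile feature with cl's support for @response-files, so the source list never has to fit on the command line. The rule name, variable name and paths below are hypothetical:

```
rule cc_batch_rsp
  # ninja writes rspfile_content ($in here) into the response file before running
  # the command, and cl reads the source list from @file instead of the command line
  command = cl /nologo /c @$rspname /Foobj\
  rspfile = $rspname
  rspfile_content = $in

build obj\a.obj obj\b.obj obj\c.obj: cc_batch_rsp src\a.cpp src\b.cpp src\c.cpp
  rspname = obj\batch_0.rsp
```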