3

Usually, when people design proper software architectures in C++ that also need great performance, they enter the dangerous game of premature optimization. But rather than optimizing at the architecture level (which is a perfectly good and encouraged form of premature optimization), they make compromises at the code level, like avoiding virtual methods and interfaces altogether, low-level hacks, etc.
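To make that concrete, here is a minimal sketch (the interface and class names are hypothetical, not from any particular project) of the kind of virtual-interface design people are tempted to abandon for performance:

// renderer.h
struct IRenderer
{
    virtual ~IRenderer() {}
    virtual void draw() = 0;   // an indirect call the compiler normally
                               // cannot inline across translation units
};

// sprite_renderer.cpp
struct SpriteRenderer : IRenderer
{
    void draw() { /* ... */ }
};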

Some people avoid this with a practice usually called application inlining or unity builds, which basically means generating one or two really big .cpp files that include all the headers and .cpp files from the whole project, and then compiling them as a single translation unit. This approach is very reliable when it comes to inlining virtual methods (devirtualization), since the compiler has everything it needs to make the required optimizations.
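As a minimal sketch (the file names are made up), the generated unity file is nothing more than a list of #include directives pulling in the project's own .cpp files:

// ALL.cpp -- generated, compiled as the single translation unit of the release build
#include "renderer.cpp"
#include "sprite_renderer.cpp"
#include "physics.cpp"
#include "main.cpp"
// ... and every other .cpp in the project

With everything visible at once, the compiler can often prove the concrete type behind an interface pointer and devirtualize or inline the call.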

Question: what drawbacks does this approach have compared to more "elegant & modern" methods like link-time optimization?

lurscher
  • 25,930
  • 29
  • 122
  • 185
  • "some people" are just wrong. Don't listen to them! – Bo Persson Jul 14 '11 at 21:52
  • 4
    please explain why they are wrong, that is what the question is about. Thanks! – lurscher Jul 14 '11 at 21:54
  • possible duplicate of [(c++) The benefits / disadvantages of unity builds?](http://stackoverflow.com/questions/847974/c-the-benefits-disadvantages-of-unity-builds) – Fred Foo Jul 14 '11 at 22:06
  • just found this for automation of the generated .cpp with CMake: http://cheind.wordpress.com/2009/12/10/reducing-compilation-time-unity-builds/ – lurscher Jul 14 '11 at 22:17

4 Answers

4

The technical name, approaching minor buzzword status, for that approach is unity build.

See for example:

The benefits / disadvantages of unity builds?

The downside is best described here:

http://leewinder.co.uk/blog/?p=394

The short version is that it is more or less a choice of languages: you either write in regular C++ or in unity-build C++. The 'correct' way of writing virtually any code will differ between the two.
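A minimal sketch of that difference, with made-up file and function names: two files that compile fine on their own collide once concatenated, so "unity-build C++" pushes you toward per-file named namespaces instead of anonymous ones:

// util_a.cpp
namespace { int clamp(int v) { return v < 0 ? 0 : v; } }

// util_b.cpp
namespace { int clamp(int v) { return v > 9 ? 9 : v; } }

// Concatenated into one translation unit, both definitions live in the *same*
// anonymous namespace, so the second clamp() is a redefinition error.
// The unity-build-friendly version gives each file its own named namespace:
namespace util_a_detail { int clamp(int v) { return v < 0 ? 0 : v; } }
namespace util_b_detail { int clamp(int v) { return v > 9 ? 9 : v; } }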

soru
  • 5,464
  • 26
  • 30
  • 1
    ok, so it seems its main drawback is maintainability - and it's not hard to imagine, but I still think there is room for automation of the ALL.cpp generation – lurscher Jul 14 '11 at 22:13
1
  • For a large project, the technique of having "all files in one" potentially increases the build time, though that only matters to the developers. With multiple smaller files, one code change usually causes less code to compile, so incremental builds should be faster. If changes are more often to header files or other components on which many files depend, then the single-file approach saves build time, since header files need not be processed by multiple dependents when they change.
  • There are a surprising number of compilers which cannot process very large source files, even those which are widely used and have popularity far in disproportion to their quality. Such compilers tend to choke at random, report errors for syntactically correct code, or generate incorrect instructions.
  • A project with multiple developers and old version control tools might have trouble coordinating changes to the limited number of modules.
wallyk
  • 56,922
  • 16
  • 83
  • 148
  • 2
    this practice is not for development! it is only for doing optimal release builds. The project is still split into several .cpp files, it's only the final build which is done in this way (I should edit my post to make that point a bit clearer). You have a point on compilers choking on big source files though, but it can always be solved by splitting into more .cpp files, each one smaller, and still reaping the rewards of out-of-the-box devirtualization – lurscher Jul 14 '11 at 22:06
  • 3
    Doing unity builds for release rules out things like anonymous namespaces, local helper functions, and file static variables. Now everything is global! Can cause interesting new overload resolutions. – Bo Persson Jul 14 '11 at 22:42
  • 1
    If you have two helper functions with different roles and the same name, is that not confusing anyway? – hiddensunset4 Aug 04 '11 at 07:50
1

One obvious drawback is potential clash between static or local symbols:

void make_peace();        // declared elsewhere
void launch_missiles();   // declared elsewhere

// File a.cpp
namespace
{
    void f() { make_peace(); }   // file-local helper
}

void action() { f(); }

// file b.cpp
void f() { launch_missiles(); }  // global helper with the same name

void action2() { f(); }          // in a unity build, which f() is this?

If b.cpp is included before a.cpp (or the other way around), Bad Things happen: once both files live in the same translation unit, both f()s are visible to unqualified lookup, so code that compiled cleanly as separate files now fails with ambiguity errors, or quietly changes meaning if the signatures are not identical.

Another drawback is that compilers (MSVC ?) may not cope very well with large files.

Also, every time you change a tiny bit of your code, you're in for a full recompilation.

In conclusion, I'd never do that. It is not worth it compared to spending a few extra bucks on an SSD* or a more powerful build machine. And link-time code generation is usually good enough.

The rare benefits I see are enhanced virtual function analysis and better alias analysis, but I really don't think it is worth it. You can use restrict or its variants for the latter, and it is very rare for the former to have a real impact on your program's performance.
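For reference, a minimal sketch of "restrict or its variants": standard C++ has no restrict keyword, but common compilers accept extensions such as __restrict (MSVC, GCC, Clang) or __restrict__ (GCC, Clang) to promise the optimizer that two pointers do not alias:

// Telling the compiler that dst and src never overlap recovers much of the
// alias analysis a unity build / LTO pass might otherwise have to prove itself.
void scale(float* __restrict dst, const float* __restrict src, int n, float k)
{
    for (int i = 0; i < n; ++i)
        dst[i] = k * src[i];
}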

* If compilation times really bother you, try to replace hard disks with SSDs on your build machines. Magic.

Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • good points; add to that that `using` directives should not be at global scope for the same reasons; about MSVC, yes, but for compilers with coping problems you can always do a partial unitization into 2 or more big (but not so big) .cpp files – lurscher Jul 15 '11 at 01:53
  • @lurscher: you never use `using`, do you ? – Alexandre C. Jul 15 '11 at 07:06
  • yes, if you do this, you have to be like Steely Dan: no static at all. – soru Jul 15 '11 at 10:48
  • 1
    @Alexandre, afaik `using` will still be OK to use as long as it is in either class-scope or function-scope. @soru, what is the problem with `static`? – lurscher Jul 15 '11 at 13:11
  • OK, static in the C sense of file scope, or the more-or-less equivalent C++ anon namespaces: other uses of static won't be affected. But it's a bit much to expect a 70's smooth rock band to spell out the exact language semantics... – soru Jul 15 '11 at 22:31
  • 1
    @lurscher: `using` at class scope may be ok (and sometimes necessary), although I take it as a sign of poor design. At function scope, it can have nasty effects, especially when ADL is thrown into the mix: the only `using` which is really unavoidable (this issue has been cleared in C++0x) is `using std::swap` at function scope, so that you can look up your own `swap` function via ADL if it is not possible to specialize `std::swap`. – Alexandre C. Jul 24 '11 at 21:07
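A minimal sketch of the `using std::swap` idiom mentioned in that last comment (Widget and its swap are made-up names): the function-scope using-declaration provides a fallback while ADL still finds a type's own swap when one exists:

#include <utility>

namespace widgets
{
    struct Widget { int* data; };
    void swap(Widget& a, Widget& b) { std::swap(a.data, b.data); }  // found via ADL
}

template <typename T>
void reset(T& a, T& b)
{
    using std::swap;   // fallback for types that have no swap of their own
    swap(a, b);        // for widgets::Widget, ADL picks widgets::swap
}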
1

Just think about it for a minute. There's some part of your program, call it A, and suppose optimization cuts the time spent in A in half. Call the program's total running time T.

Now there are other parts of your program also, so let's think about what effect A's optimization has on the whole program.

If the time originally spent in A was none (0% of T), then what is the effect of A's optimization? Zero.

If the time originally spent in A was all (100% of T), then what is the effect of A's optimization? T/2.

If the time originally spent in A was half (50% of T), then the effect of A's optimization is T/4.

So the benefit of optimizing a piece of code is proportional to the amount of time originally spent in that code.
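This reasoning is essentially Amdahl's law; a minimal sketch of the arithmetic, with made-up numbers:

#include <cstdio>

// Total time after speeding up a fraction f of the program by factor s:
// T_new = T * (1 - f) + T * f / s
double optimized_time(double T, double f, double s)
{
    return T * (1.0 - f) + T * f / s;
}

int main()
{
    // If only 10% of the time is spent in code the unity build can devirtualize,
    // even an ideal 2x speedup there shaves just 5% off the total.
    std::printf("%.2f\n", optimized_time(100.0, 0.10, 2.0));   // prints 95.00
}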


So if one wants inlining, avoidance of virtual functions, etc. to have significant benefit, what kind of code does it have to be?

It has to be code that, before optimization, contained the program counter (exclusive time) for a significant fraction of the time.


In significant applications containing many layers of function/method calls, where the call tree nearly always dips down into new, or I/O, or string libraries, or data-structure or database libraries that are outside the application's source code, what percent of the total time is exclusive time in the code compiled as part of the application?

Often (not always) from little to very little. And the potential benefit from inlining or other compiler/linker optimization is proportional to that.

Mike Dunlavey
  • 40,059
  • 14
  • 91
  • 135
  • I posed the example of virtual vs. inline calls because it's sort of representative of the gains from interprocedural optimizations that are not possible with the traditional compilation model, but certainly it's not the only benefit you'll get from it – lurscher Jul 15 '11 at 03:42