
The only real use of the --whole-archive linker option that I have seen is in creating shared libraries from static ones. Recently I came across Makefiles that always use this option when linking with in-house static libraries. This of course causes the executables to pull in unreferenced object code unnecessarily. My reaction was that this is plain wrong; am I missing something here?

The second question I have has to do with something I read regarding the --whole-archive option but couldn't quite parse. Something to the effect that --whole-archive should be used when linking with a static library if the executable also links with a shared library which in turn has (in part) the same object code as the static library. That is, the shared library and the static library overlap in terms of object code. Using this option would force all symbols (regardless of use) to be resolved in the executable. This is supposed to avoid object code duplication. This is confusing: if a symbol is referenced in the program, it must be resolved uniquely at link time, so what is this business about duplication? (Forgive me if this paragraph is not quite the epitome of clarity.)

Thanks

5 Answers


There are legitimate uses of --whole-archive when linking an executable with static libraries. One example is building C++ code where global instances "register" themselves in their constructors (warning: untested code):

handlers.h

typedef void (*handler)(const char *data);
void register_handler(const char *protocol, handler h);
handler get_handler(const char *protocol);

handlers.cc (part of libhandlers.a)

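#include "handlers.h"
#include <map>
#include <string>

// Key on std::string: a const char* key would compare pointers, not text.
typedef std::map<std::string, handler> HandlerMap;

// Function-local static: constructed on first use, so it is ready even when
// register_handler() is called from another file's global constructor.
static HandlerMap &handlers() {
   static HandlerMap m;
   return m;
}
void register_handler(const char *protocol, handler h) {
   handlers()[protocol] = h;
}
handler get_handler(const char *protocol) {
   HandlerMap::iterator it = handlers().find(protocol);
   if (it == handlers().end()) return nullptr;
   return it->second;
}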

http.cc (part of libhttp.a)

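#include "handlers.h"
class HttpHandler {
public:  // the constructor must be accessible to define the global below
    HttpHandler() { register_handler("http", &handle_http); }
private:
    static void handle_http(const char *) { /* whatever */ }
};
HttpHandler h; // global instance: its constructor registers the handler before main() runs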

main.cc

#include "handlers.h"
int main(int argc, char *argv[])
{
    for (int i = 1; i < argc-1; i+= 2) {
        handler h = get_handler(argv[i]);
        if (h != nullptr) h(argv[i+1]);
    }
}

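Note that there are no symbols in http.cc that main.cc needs. The linker only extracts an archive member when that member defines a symbol that is still undefined, so if you link this as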

g++ main.cc -lhttp -lhandlers

you will not get an http handler linked into the main executable, and will not be able to call handle_http(). Contrast this with what happens when you link as:

g++ main.cc -Wl,--whole-archive -lhttp -Wl,--no-whole-archive -lhandlers

The same "self registration" style is also possible in plain-C, e.g. with the __attribute__((constructor)) GNU extension.
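The constructor-attribute variant can be sketched as follows. This is a minimal single-file illustration under stated assumptions, not the answer's own code: the "ftp" protocol and the names `handle_ftp`, `register_ftp`, and `ftp_calls` are hypothetical, and the registry is inlined so the block compiles on its own. It is written as C++ to match the rest of the example; the attribute itself is a GNU extension supported by GCC and Clang.

```cpp
// selfreg.cc: single-file sketch. In a real build the registry and the
// handler would live in separate static libraries.
#include <cassert>
#include <map>
#include <string>

typedef void (*handler)(const char *data);

// Function-local static: constructed on first use, so it is ready even when
// it is first touched from a constructor function that runs before main().
static std::map<std::string, handler> &registry() {
    static std::map<std::string, handler> m;
    return m;
}

void register_handler(const char *protocol, handler h) {
    assert(protocol != nullptr && h != nullptr);
    registry()[protocol] = h;
}

handler get_handler(const char *protocol) {
    std::map<std::string, handler>::iterator it = registry().find(protocol);
    return it == registry().end() ? nullptr : it->second;
}

static int ftp_calls = 0;  // hypothetical counter, only here to observe calls
static void handle_ftp(const char *) { ++ftp_calls; }

// GNU extension: this function runs before main(), with no C++ object
// involved. Nothing else in the program refers to it by name.
__attribute__((constructor)) static void register_ftp() {
    register_handler("ftp", &handle_ftp);
}
```

As with the global-object version, if `register_ftp` lived in its own archive member, nothing would reference that member, and the linker would skip it unless --whole-archive (or a similar mechanism) forced it in.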

NearHuscarl
Employed Russian
  • @Employed Russian If libhttp.a can be built, that proves the register_handler function exists in libhttp.a. So how can it refer to register_handler in main.cc? In this case we must use some other way to implement your idea. – longbkit Oct 21 '11 at 10:01
  • @longbkit I've updated the answer so that handlers is factored out into a lower-level library, as it would need to be. I resisted the temptation to change the `handler` type from a C function pointer to a C++ `std::function`. – Arthur Tacca Sep 10 '19 at 10:18
  • Is there any way to target whole archive functionality to specific global registration symbols, instead of an entire library? – David Jul 28 '20 at 12:25
  • @David If you want to pull just a specific set of symbols, but not the whole archive, `--whole-archive` is (obviously) not appropriate. Use `-Wl,-u,needed_symbol` instead. – Employed Russian Jul 28 '20 at 14:38
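A sketch of how the `-u` suggestion could be applied to the http example, assuming a hypothetical anchor function `http_handler_anchor` is added to http.cc (the name is illustrative only and is not part of the original answer):

```cpp
// Hypothetical addition to http.cc (part of libhttp.a).
// extern "C" keeps the symbol name unmangled, so it can be spelled on the
// link line exactly as it is written here.
extern "C" int http_handler_anchor() { return 1; }

// Link line (sketch): -u enters http_handler_anchor as an undefined symbol,
// so the linker extracts the archive member that defines it, and that
// member's global constructors come along with it:
//
//   g++ main.cc -Wl,-u,http_handler_anchor -lhttp -lhandlers
```

Only the archive member that defines the named symbol is pulled in; other members of libhttp.a are still subject to the usual on-demand extraction.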

Another legitimate use for --whole-archive is for toolkit developers to distribute libraries containing multiple features in a single static library. In this case, the provider has no idea what parts of the library will be used by the consumer and therefore must include everything.

Steve Brooks
  • static libraries include everything without use of --whole-archive, this seems like a rather pointless thing to be doing – WinterMute Jan 23 '19 at 12:41
  • Absolutely incorrect. When you make a library, whether static or dynamic, it will contain ALL named object files. – Swiss Frank Sep 29 '20 at 04:12

Another scenario where --whole-archive is useful is incremental linking with static libraries.

Let us suppose that:

  1. libA implements the a() and b() functions.
  2. Some portion of the program has to be linked against libA only, e.g. due to some function wrapping using --wrap (a classical example is malloc)
  3. libC implements the c() function and uses a()
  4. the final program uses a() and c()

Incremental linking steps could be:

ld -r -o step1.o module1.o --wrap malloc --whole-archive -lA
ld -r -o step2.o step1.o module2.o --whole-archive -lC
cc step2.o module3.o -o program

Without --whole-archive, c() would be left out of step2.o unless something in that partial link already references it, even though the final program uses it, and the final link would then fail with an undefined reference.

Of course, this is a corner case in which incremental linking is needed to avoid wrapping every call to malloc in every module, but it is a case that --whole-archive supports well.

ilpelle

I agree that using --whole-archive to build executables is probably not what you want (it links in unneeded code and produces bloated software). If they had a good reason to do so, they should have documented it in the build system; as it is, you are left guessing.

As to the second part of your question: if an executable links both a static library and a dynamic library that has (in part) the same object code as the static library, then --whole-archive will ensure that at link time the code from the static library is preferred. This is usually what you want when you do static linking.

lothar
  • One reason to use `--whole-archive`: it is normally used to turn an archive file (.o/.a) into a shared library, forcing every object to be included in the resulting shared library, for example to combine several libraries that were first built as static libraries. ref: [Using LD, the GNU linker - Options](https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_3.html) – Kevin Chou Apr 23 '21 at 03:34

Old query, but on your first question ("Why"), I've seen --whole-archive used for in-house libraries as well, primarily to sidestep circular references between those libraries. It tends to hide poor architecture of the libraries, so I'd not recommend it. However it's a fast way of getting a quick trial working.

For your second query, if the same symbol is present in both a shared object and a static library, the linker will satisfy the reference with whichever library it meets first.
If the shared library and the static library contain exactly the same code, this may all just work. But where they have different implementations of the same symbols, your program will still link but will behave differently depending on the order of the libraries.

Forcing all symbols to be loaded from the static library is one way of removing confusion as to what is loaded from where. But in general this sounds like solving the wrong problem; you mostly won't want the same symbols in different libraries.