5

With static compilation, only the functions of a library which are actually needed by a program are linked into the program. How does that work with shared libraries? Are only the functions actually needed by a program loaded into memory by the dynamic linker, or is the entire shared library always loaded? If it is individual functions, how could I find the actual size of a program, including its loaded functions, at runtime?

Thank you! Oliver

Oliver
  • Good question... I have NO IDEA what the answer is though, because I've never needed to think about it... which poses the counter question "Why do you think this is important?" ;-) Ergo: What's your actual problem? Give us some context. – corlettk Apr 30 '11 at 05:29
  • 2
    If the libraries are shared, they might be used by several programs at the same time. How do you count that? – Bo Persson Apr 30 '11 at 05:29
  • 1
    When a shared lib is loaded into memory, does it add size to your executable? Who knows? The shared library could be loaded into memory and re-used (in the same location) by all applications. Even if you find an answer, it does not mean that it will hold in the future; each OS can change and optimize its handling of shared libraries. This is so far outside the specification of any language that it lands solely in the realm of the OS. – Martin York Apr 30 '11 at 05:38
  • @mu is too short - Some linkers will optimize out unused functions. – Chris Lutz Apr 30 '11 at 05:47
  • @mu is too short: No, I think you are wrong, at least for g++ and ld; I am judging from my own experience. Without special options the linker will strip out unused symbols. Check [this](http://stackoverflow.com/questions/5685617/missing-symbols-from-static-library-in-linked-executable) question for more info. – beduin Apr 30 '11 at 05:48
  • Thank you all for your comments. I'm interested in this because of my interest in optimized embedded Linux systems. If I understand mu correctly, then large object files in a static library would actually be bad, because potentially a lot of functions would be part of the executable but never used? And yes, Bo and Martin, you are right of course that the question of size would not make sense in the case of a library shared by multiple processes. Let's assume there is just one such process. – Oliver Apr 30 '11 at 05:56
  • Okay, I stand corrected. Historically, linkers were dumb and hence the old school "one function per file" hack. Lucky for me that I haven't had to statically link anything in over a decade. – mu is too short Apr 30 '11 at 05:56

2 Answers

8

With static compilation, only the functions of a library which are actually needed by a program are linked into the program. How does that work with shared libraries?

Shared libraries are referenced by the program symbolically, that is, the program will identify, by name, the shared library it was linked with.
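
For illustration, here is a minimal sketch of that same by-name lookup done explicitly with `dlopen`/`dlsym` (Linux/glibc assumed; `libm.so.6` and `cos` are just example names, any exported symbol would do). Running `ldd` on an executable shows the library names recorded by normal linking.

```c
/* Minimal sketch (Linux/glibc assumed): resolve a library and an entry
 * point by name at run time, the same kind of by-name lookup the dynamic
 * linker performs for normally linked shared libraries.
 * Build with: gcc lookup.c -ldl
 */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Open the math library by its soname; with RTLD_LAZY nothing in it is
     * resolved until it is actually referenced. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up one entry point by name. */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (!cosine) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return 1;
    }

    printf("cos(0.0) = %f\n", cosine(0.0));
    dlclose(handle);
    return 0;
}
```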

Are only the functions actually needed by a program loaded into memory by the dynamic linker, or is the entire shared library always loaded?

The program will reference specific entry points and data objects in the shared library. The shared library will be mapped into memory as a single large object, but only the pages that are actually referenced will be paged in by the kernel. The total amount of the library that gets loaded will depend on the density of references, on references by other images linked to it, and on the locality of the library's own functionality.
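
You can see this difference on Linux in `/proc/self/smaps`, which breaks every mapping of a process down into how much is mapped (`Size`) versus how much is actually resident in RAM (`Rss`). A rough sketch (Linux-specific; "libc" is just an example library to filter on):

```c
/* Rough sketch (Linux-specific): print mapped (Size) vs. resident (Rss)
 * amounts for this process's libc mappings, taken from /proc/self/smaps.
 */
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[512];
    int in_lib = 0;
    while (fgets(line, sizeof line, f)) {
        /* Mapping headers start with a lowercase hex address range;
         * the per-mapping counters start with a capitalized keyword. */
        if (isdigit((unsigned char)line[0]) ||
            (line[0] >= 'a' && line[0] <= 'f')) {
            in_lib = (strstr(line, "libc") != NULL);
            if (in_lib)
                fputs(line, stdout);            /* the mapping itself */
        } else if (in_lib &&
                   (strncmp(line, "Size:", 5) == 0 ||
                    strncmp(line, "Rss:", 4) == 0)) {
            fputs(line, stdout);                /* mapped vs. resident */
        }
    }
    fclose(f);
    return 0;
}
```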

If it is individual functions, how could I find the actual size of a program, including its loaded functions, at runtime?

The best way on Mac and other Unix-based systems is with ps(1).
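
On Linux specifically, a process can also read its own numbers from `/proc/self/statm`; the first two fields correspond roughly to the VSZ and RSS columns of `ps -o vsz,rss`. A hedged sketch:

```c
/* Hedged sketch (Linux-specific): report this process's total virtual size
 * and resident set from /proc/self/statm. The values in the file are in
 * pages, so scale by the page size.
 */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    long page_kb = sysconf(_SC_PAGESIZE) / 1024;

    FILE *f = fopen("/proc/self/statm", "r");
    if (!f) { perror("fopen"); return 1; }

    long size, resident;
    if (fscanf(f, "%ld %ld", &size, &resident) != 2) {
        fclose(f);
        return 1;
    }
    fclose(f);

    printf("virtual size: %ld kB, resident: %ld kB\n",
           size * page_kb, resident * page_kb);
    return 0;
}
```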

DigitalRoss
  • Thank you very much for your answer! If I understand you correctly, then only those pages containing code of the needed functions are actually loaded. Does that mean, for embedded systems, that if I had only a small RAM but large libraries (which may, e.g., not all fit into the RAM), I could still run the executable (in Linux)? – Oliver Apr 30 '11 at 06:10
  • Yes, that is what it means (remember to approve the answer if you think it answers your question). Although you really should plan for the worst case and make sure you don't run out of memory, no matter which functions you call in your dependencies. – Joseph Lisee Apr 30 '11 at 07:36
  • @Oliver: maybe. Because page granularity is coarse, a dynamic image will load lots of bits that you never use, and the library itself probably touches way more functionality than it should even for "hello, world". It's a difficult tradeoff: static linking uses less memory for the one application, but it means the libraries are not shared with other separately linked programs and system daemons. I would static link any library used only once and dynamically load those shared by essential services. Note that you *can* run multiple copies of the same static image without a memory penalty. – DigitalRoss May 03 '11 at 19:23
2

When you link statically, only the functions that are (potentially) called get linked into the executable -- but at run-time, the data from the executable file will be read into memory by demand paging.

When the process is created, addresses are assigned to all the code in the executable and shared libraries for that process, but the code/data from the file isn't necessarily read into physical memory at that time. When you attempt to access an address that's not currently in physical memory, it'll trigger a not-present exception. The OS virtual memory manager will react to that by reading the page from the file into physical memory, then letting the access proceed.
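
A small sketch that makes this visible (Linux assumed, using `mincore(2)`). It uses an anonymous mapping to stay self-contained; file-backed code pages behave analogously, except that the fault is satisfied from the executable or library file instead of by zero-fill.

```c
/* Illustrative sketch (Linux assumed): map memory, then count resident
 * pages with mincore() before and after touching some of them. Pages only
 * become resident when they are actually accessed (demand paging).
 */
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static size_t resident_pages(void *addr, size_t len, size_t pagesz) {
    size_t npages = (len + pagesz - 1) / pagesz;
    unsigned char *vec = malloc(npages);
    size_t count = 0;
    if (vec && mincore(addr, len, vec) == 0)
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;
    free(vec);
    return count;
}

int main(void) {
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 256 * pagesz;                 /* 1 MiB with 4 KiB pages */

    char *map = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    printf("resident right after mmap: %zu of %zu pages\n",
           resident_pages(map, len, pagesz), len / pagesz);

    /* Touch every other page; only the touched pages get faulted in. */
    for (size_t off = 0; off < len; off += 2 * pagesz)
        map[off] = 1;

    printf("resident after touching:   %zu of %zu pages\n",
           resident_pages(map, len, pagesz), len / pagesz);

    munmap(map, len);
    return 0;
}
```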

The loading is done on a page-by-page basis, which usually means blocks of 4 or 8 kilobytes at a time (e.g., x86 uses 4K pages; Alpha used 8K). x86 also has the ability to create larger (4 megabyte) pages, but those aren't (at least usually) used for normal code -- they're for mapping big blocks of memory that remain mapped (semi-)permanently, such as the "window" of memory on a typical graphics card that's also mapped so it's directly accessible by the CPU.
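
If you want to know the sizes in play on a given system, a tiny sketch (Linux assumed) that queries the normal page size and the default huge page size reported by `/proc/meminfo`:

```c
/* Tiny sketch (Linux assumed): print the normal page size, plus the
 * default huge page size line from /proc/meminfo if present.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    printf("page size: %ld bytes\n", sysconf(_SC_PAGESIZE));

    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];
    while (f && fgets(line, sizeof line, f))
        if (strncmp(line, "Hugepagesize:", 13) == 0)
            fputs(line, stdout);   /* the kernel's default huge page size */
    if (f) fclose(f);
    return 0;
}
```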

Most loaders have some optimizations, so (for example) they'll attempt to read bigger blocks of memory when the program first starts up. This lets it start faster than if there were a page fault and a separate read for each page of code as it's accessed. The exact details of that optimization vary between OSes (and often even between versions of the same OS).
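
An application can ask for similar readahead itself. A sketch (Linux assumed; the file path argument is just a placeholder) using `madvise(MADV_WILLNEED)` to hint that a whole mapping should be read in ahead of the individual faults:

```c
/* Sketch (Linux assumed): map a file read-only and hint to the kernel that
 * the whole range will be needed soon, so it may read it in larger chunks
 * instead of one page per fault.
 */
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    if (fd < 0 || fstat(fd, &st) < 0) { perror(argv[1]); return 1; }

    void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (map == MAP_FAILED) { perror("mmap"); return 1; }

    /* Readahead hint: we expect to access this whole range soon. */
    if (madvise(map, (size_t)st.st_size, MADV_WILLNEED) != 0)
        perror("madvise");

    /* ... use the mapping here ... */

    munmap(map, (size_t)st.st_size);
    close(fd);
    return 0;
}
```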

Jerry Coffin