Based on the comments in this C tutorial page, which I'm using to review C after some time away from it, I expect that a simple program linked against the shared version of a library will appear to use less memory than the same program linked against the static version.

Here is a simple example program, which requests user input just so the program will sit idle while I use top and ps to inspect it. The goal is to compile the program with just -lm (linking libm, the math library), then compile it again with -lm --static. When I run each program, I should see the shared option taking up less memory in its running process.

/* lib_a_vs_so.c */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = sin(3.14);
    int user_input;

    /* Block on input so the process stays alive for top and ps. */
    printf("Enter a number: ");
    if (scanf("%d", &user_input) != 1)
        return 1;
    printf("You entered %d and sin(3.14) is %.2f\n", user_input, x);
    return 0;
}

The steps to compile two different versions:

c99 -o so_version lib_a_vs_so.c -lm
c99 -o a_version lib_a_vs_so.c -lm --static

Here is the output from top on my system (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.1)) while both programs are running:

top - 19:06:51 up 10:20,  5 users,  load average: 0.06, 0.24, 0.25
Tasks:   2 total,   0 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.4 us,  0.8 sy,  0.0 ni, 96.4 id,  0.3 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:  16327932 total,  7522656 used,  8805276 free,   435692 buffers
KiB Swap:        0 total,        0 used,        0 free.  2836848 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                    
 7010 ely       20   0    4192    356    276 S   0.0  0.0   0:00.00 so_version                                                                 
 7025 ely       20   0    1080    264    212 S   0.0  0.0   0:00.00 a_version 

Why does so_version appear to use more memory?

ely
  • Very simply speaking, if you use a static library, you get only out of it what you need. If you use a shared library, you get the whole thing, *but* you typically share it with any other process(es) on the system that happen to be using the same library. Whether that looks like more or less memory will depend to some extent on how many other processes are using it, and how the accounting for shared memory works. – Steve Summit Jan 23 '18 at 00:25
  • Ah, that makes sense. I also see now that the disk footprint of the two executables does match with intuition. 868k for `a_version` but only 20k for `so_version`. Does this disk size reflect the entire libmath.a, or somehow just a part related to `sin` plus macros/whatever? – ely Jan 23 '18 at 00:43
  • Your program does not only use libm, it also uses libc. – Deduplicator Jan 23 '18 at 01:04
  • @Deduplicator This confuses me more. When I try with `c99 -o a_version lib_a_vs_so.c /usr/lib/x86_64-linux-gnu/libm.a`, now both executables have the same size on disk, of 20K (the original size of the .so version). Why does use of dynamic libc coupled with static libm.a result in something the same size as if both libraries are dynamic? – ely Jan 23 '18 at 01:15
  • Well, your compiler *knows* `sin()`, so it can execute it at compile-time, or inline it, or whatever. Meaning no symbol from libm is actually used, thus all references to it are discarded... – Deduplicator Jan 23 '18 at 01:28
  • Gotcha. Thank you for clarifying that. – ely Jan 23 '18 at 01:56

1 Answer

Static Linking

When you statically link a library into a binary and run the binary, the operating system loads only the pages the program actually touches. When an instruction references a location in virtual memory that is not yet in physical memory, a page fault occurs, and the OS loads the page and resumes execution.[1]

Dynamic Loading

When dynamically loading a library, the operating system uses mmap to map the shared object.[2] mmap implements demand paging as well, so the entire object is not loaded into physical memory. However, the object is shared between multiple processes, so parts of it may already be in physical memory because another process touched them.

Answer

I think there are two possibilities for the behavior your question addresses (which are not necessarily mutually exclusive):

  1. (more likely, IMO) When top calculates the memory usage, it counts the shared object space as being used by the process, which includes all of the shared object that is physically loaded, even the parts this particular program didn't need. This makes the memory used by this process appear larger than that of its statically linked counterpart, which has less of its statically linked library physically loaded.
  2. (less likely, IMO) The address space mapped for the shared object is guaranteed to contain the entire library, since at any time a new program could pop up that needs to use any part of the shared library. With static linking, however, the linker can skip compilation units that are never referenced by the rest of the program being linked (source). That reduces how much of the static library ends up in the executable (and thus lowers memory usage at runtime).

[1] This applies to all major operating systems (Linux-based, BSD-based (macOS, iOS), and Windows). I'm sure there is an OS out there that loads entire binaries into memory before running them, but that's outside the scope of this answer.

[2] I know that Linux and FreeBSD do this. Windows probably does it with DLLs, but I don't know for sure.

Greg Schmit