
I was trying to find the virtual set size (VSS) and resident set size (RSS) of a C program. I wrote a kernel module that traverses the vm_areas and calculates the VSS and RSS, and a C program to validate the changes in VSS and RSS.
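The module's traversal is roughly equivalent to the following simplified sketch (the actual module code is omitted here; this assumes an older kernel where mm->mmap is still a linked list and mmap_sem is the mm lock, and it elides error handling and task refcounting):

    /* Simplified sketch of the per-process traversal. */
    #include <linux/module.h>
    #include <linux/sched.h>
    #include <linux/mm.h>
    #include <linux/pid.h>

    static void print_vss_rss(pid_t pid)
    {
        struct task_struct *task;
        struct vm_area_struct *vma;
        unsigned long vss_kb = 0;

        rcu_read_lock();
        task = pid_task(find_vpid(pid), PIDTYPE_PID);
        rcu_read_unlock();
        if (!task || !task->mm)
            return;

        down_read(&task->mm->mmap_sem);
        for (vma = task->mm->mmap; vma; vma = vma->vm_next)
            vss_kb += (vma->vm_end - vma->vm_start) >> 10; /* size of each vm_area */
        up_read(&task->mm->mmap_sem);

        /* RSS from the mm counters: pages -> KB */
        pr_info("VSS=%lukB RSS=%lukB\n", vss_kb,
                get_mm_rss(task->mm) << (PAGE_SHIFT - 10));
    }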

// sample test program
#define _GNU_SOURCE          /* for syscall() */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/syscall.h>

int main() {

    // setup...
    int *arg1 = malloc(5 * sizeof(int));
    char *arg2 = malloc(sizeof(char) * 1024);
    pid_t _pid = getpid();
    int rss, prev_rss, vss, prev_vss;
    printf("pid of this process = %d\n", _pid);
    //...


    // first observation
    arg1[0] = (int)(_pid);
    long res = syscall(333, arg1, &arg2); // custom syscall backed by the kernel module
    vss = prev_vss = arg1[1];             // arg1[1] stores the vss from the kernel module
    rss = prev_rss = arg1[2];             // arg1[2] stores the rss from the kernel module
    printf("vss = %d rss = %d\n", vss, rss);


    unsigned int *ptr = malloc(1 << 21);  // 2 MB
    printf("ptr = %p\n", ptr);


    // second observation
    arg1[0] = (int)(_pid);
    res = syscall(333, arg1, &arg2);
    vss = arg1[1];
    rss = arg1[2];
    printf("vss = %d rss = %d\n", vss, rss);
    if (vss - prev_vss > 0) {
        printf("change in vss = %d\n", vss - prev_vss);
    }
    if (rss - prev_rss > 0) {
        printf("change in rss = %d\n", rss - prev_rss);
    }

    prev_vss = vss;
    prev_rss = rss;

    // ...

    return 0;
}

The output of the above program:

pid of this process = 12964
vss = 4332 rss = 1308
ptr = 0x7f4077464010
vss = 6384 rss = 1312
change in vss = 2052
change in rss = 4

Here is the dmesg output. First observation:

[11374.065527]  1 = [0000000000400000-0000000000401000]      RSS=4KB     sample
[11374.065529]  2 = [0000000000600000-0000000000601000]      RSS=4KB     sample
[11374.065530]  3 = [0000000000601000-0000000000602000]      RSS=4KB     sample
[11374.065532]  4 = [0000000000c94000-0000000000cb5000]      RSS=4KB     
[11374.065539]  5 = [00007f4077665000-00007f407781f000]      RSS=1064KB      libc-2.19.so
[11374.065546]  6 = [00007f407781f000-00007f4077a1f000]      RSS=0KB     libc-2.19.so
[11374.065547]  7 = [00007f4077a1f000-00007f4077a23000]      RSS=16KB    libc-2.19.so
[11374.065549]  8 = [00007f4077a23000-00007f4077a25000]      RSS=8KB     libc-2.19.so
[11374.065551]  9 = [00007f4077a25000-00007f4077a2a000]      RSS=16KB    
[11374.065553]  10 = [00007f4077a2a000-00007f4077a4d000]     RSS=140KB   ld-2.19.so
[11374.065554]  11 = [00007f4077c33000-00007f4077c36000]     RSS=12KB    
[11374.065556]  12 = [00007f4077c49000-00007f4077c4c000]     RSS=12KB    
[11374.065557]  13 = [00007f4077c4c000-00007f4077c4d000]     RSS=4KB     ld-2.19.so
[11374.065559]  14 = [00007f4077c4d000-00007f4077c4e000]     RSS=4KB     ld-2.19.so
[11374.065561]  15 = [00007f4077c4e000-00007f4077c4f000]     RSS=4KB     
[11374.065563]  16 = [00007ffcdf974000-00007ffcdf995000]     RSS=8KB     
[11374.065565]  17 = [00007ffcdf9c3000-00007ffcdf9c6000]     RSS=0KB     
[11374.065566]  18 = [00007ffcdf9c6000-00007ffcdf9c8000]     RSS=4KB

Second observation:

[11374.065655]  1 = [0000000000400000-0000000000401000]      RSS=4KB     sample
[11374.065657]  2 = [0000000000600000-0000000000601000]      RSS=4KB     sample
[11374.065658]  3 = [0000000000601000-0000000000602000]      RSS=4KB     sample
[11374.065660]  4 = [0000000000c94000-0000000000cb5000]      RSS=4KB     
[11374.065667]  5 = [00007f4077464000-00007f4077665000]      RSS=4KB     
[11374.065673]  6 = [00007f4077665000-00007f407781f000]      RSS=1064KB      libc-2.19.so
[11374.065679]  7 = [00007f407781f000-00007f4077a1f000]      RSS=0KB     libc-2.19.so
[11374.065681]  8 = [00007f4077a1f000-00007f4077a23000]      RSS=16KB    libc-2.19.so
[11374.065683]  9 = [00007f4077a23000-00007f4077a25000]      RSS=8KB     libc-2.19.so
[11374.065685]  10 = [00007f4077a25000-00007f4077a2a000]     RSS=16KB    
[11374.065687]  11 = [00007f4077a2a000-00007f4077a4d000]     RSS=140KB   ld-2.19.so
[11374.065688]  12 = [00007f4077c33000-00007f4077c36000]     RSS=12KB    
[11374.065690]  13 = [00007f4077c49000-00007f4077c4c000]     RSS=12KB    
[11374.065691]  14 = [00007f4077c4c000-00007f4077c4d000]     RSS=4KB     ld-2.19.so
[11374.065693]  15 = [00007f4077c4d000-00007f4077c4e000]     RSS=4KB     ld-2.19.so
[11374.065695]  16 = [00007f4077c4e000-00007f4077c4f000]     RSS=4KB     
[11374.065697]  17 = [00007ffcdf974000-00007ffcdf995000]     RSS=8KB     
[11374.065699]  18 = [00007ffcdf9c3000-00007ffcdf9c6000]     RSS=0KB     
[11374.065701]  19 = [00007ffcdf9c6000-00007ffcdf9c8000]     RSS=4KB

The virtual address of ptr was found to be 0x7f4077464010, which corresponds to the 5th vm_area in the second observation (0x7f4077665000 - 0x7f4077464000 = 0x201000 bytes = 2052 KB):

[00007f4077464000-00007f4077665000]      VSS=2052KB // from the VSS outputs

My questions are:

  1. Why is there a difference between the requested malloc size (2048 KB) and the VSS of the 5th vm_area (2052 KB)?

  2. We have not accessed the memory region pointed to by ptr yet, so why is one physical page already allocated, as the RSS result for the 5th vm_area in the second observation shows? (Is it possibly because of the new vm_area_struct?)

Thank you!

Debashish
    `Why is there a difference between the requested malloc size (2048 KB) and the VSS of the 5th vm_area (2052 KB)?` As usual with allocators, more space is used than the user requested: the memory beyond the requested size is used for the allocator's own bookkeeping. – Tsyvarev Jan 31 '18 at 12:51

1 Answer


malloc(n) does not allocate exactly n bytes of memory: malloc is not a system call but a library function.

In general, malloc performs the following steps:

  1. extend the heap space via brk (if needed)
  2. do mmap to map virtual addresses with physical addresses
  3. allocate some metadata (for managing the heap space, usually a linked list)

In step 3, one page of the new region is accessed to write that metadata. This means one physical page is faulted in, which increases the RSS by 4 KB (one page), as illustrated below.
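To make this concrete, here is a small sketch assuming glibc's allocator (malloc_usable_size is a GNU extension): a large request gets its own mmap'd region one page bigger than asked, with a chunk header at the start of the mapping and the returned pointer just past it. That extra header page explains both the 2052 KB VSS (2048 KB + 4 KB) and the ...010 pointer ending in the question.

    /* Sketch assuming glibc: the returned pointer sits just past the
     * chunk header at the page-aligned start of the mapping. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <stdint.h>
    #include <malloc.h>

    int main(void) {
        unsigned char *p = malloc(1 << 21);            /* request 2048 KB */
        printf("p           = %p\n", (void *)p);
        printf("page offset = %lu bytes\n",
               (unsigned long)((uintptr_t)p & 0xfff)); /* typically 16 on 64-bit glibc */
        printf("usable size = %zu bytes\n", malloc_usable_size(p));
        free(p);
        return 0;
    }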

Joonsung Kim
  • Often the heap space (that is, the virtual address space) is extended with `mmap`. `brk` or `sbrk` are obsolete system calls, and many `malloc` implementations don't use them anymore. – Basile Starynkevitch Jan 31 '18 at 13:00
  • @BasileStarynkevitch Is there any reason why they don't use brk or sbrk anymore? Performance? – Joonsung Kim Jan 31 '18 at 13:08
  • `mmap` is more general and more multi-thread-friendly. And a single `malloc` call won't do both `mmap` & `sbrk`, but one of them. BTW, `malloc` implementations are generally free software (e.g. because GNU [glibc](https://www.gnu.org/software/libc/) or [musl-libc](http://musl-libc.org/) are free software), so you can study their source code. Lastly, `mmap` does *not* map virtual addresses to physical addresses; it grows the [virtual address space](https://en.wikipedia.org/wiki/Virtual_address_space) of your process. – Basile Starynkevitch Jan 31 '18 at 13:16
  • The mapping from virtual addresses to physical address is done by the kernel, which provides the "virtual address space" abstraction (or illusion) to user-land processes. – Basile Starynkevitch Jan 31 '18 at 13:20
  • @BasileStarynkevitch Recent `malloc` implementations use both `mmap` and `sbrk` (`sbrk` is not an obsolete system call). The two system calls are used in different situations (refer to [understanding_malloc](https://sploitfun.wordpress.com/2015/02/10/understanding-glibc-malloc/)). – Joonsung Kim Jan 31 '18 at 13:41
  • "mmap is not mapping virtual addresses to physical addresses, but growing the virtual address space of your process" => No, mmap syscall maps virtual addresses with physical addresses. It does **both** mapping and growing virtual address space. – Joonsung Kim Jan 31 '18 at 13:43
  • No, the kernel does the mapping (and that mapping is changed by a successful `mmap` or `sbrk`). User code sees only virtual addresses. And you can implement `malloc` *without* `sbrk`, using only `mmap` (and `munmap`, often called by `free` for *large* blocks). – Basile Starynkevitch Jan 31 '18 at 13:44
  • @BasileStarynkevitch First, the mapping issue. I think there is some terminology difference between us. The mmap system call allocates a vm_area struct which represents the virtual region (I called this procedure **mapping**). Surely, the real page-table mapping is established when a page fault occurs. The page-fault handler traverses struct mm (which contains multiple vm_areas) to update the page table. (I think you refer to this page-table update as **mapping**.) – Joonsung Kim Jan 31 '18 at 13:55
  • The application's process (and code) doesn't see `vm_area`; it is a kernel thing. The virtual address space is visible, e.g. through `/proc/self/maps`. – Basile Starynkevitch Jan 31 '18 at 13:56
  • Second, the sbrk-vs-mmap issue. brk is faster than mmap for allocating small regions (refer to [mmap_vs_sbrk](https://stackoverflow.com/questions/5517601/mmap-vs-sbrk-performance-comparison/5517984)). Recent `malloc` implementations use both system calls, not `mmap` alone (the snippet after this thread illustrates both paths). – Joonsung Kim Jan 31 '18 at 13:56
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/164271/discussion-between-basile-starynkevitch-and-joonsung-kim). – Basile Starynkevitch Jan 31 '18 at 13:57
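To watch both allocation paths discussed in the comments, here is a minimal sketch (assuming glibc, where M_MMAP_THRESHOLD defaults to 128 KB) that can be run under strace:

    /* Assuming glibc: requests below M_MMAP_THRESHOLD (128 KB by
     * default) are served from the brk-managed heap; larger ones get
     * their own mmap'd region. Run under:
     *   strace -e trace=brk,mmap ./a.out
     * to watch which syscall each allocation triggers. */
    #include <stdlib.h>

    int main(void) {
        void *small = malloc(100);       /* usually extends the heap via brk */
        void *large = malloc(1 << 21);   /* above the threshold: uses mmap   */
        free(small);
        free(large);
        return 0;
    }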