In Linux, the system call fork is said to use a copy-on-write (COW) mechanism, thus avoiding actually duplicating memory unless it's really needed.
However, I've never seen clear evidence of COW in action, so I thought I'd build my own example.
For this purpose I run a program that calculates the square of a matrix with 40,000 rows and columns, each entry being a double (8 bytes). The program forks into 8 processes to do the squaring. The actual values are populated at random. Both the initial matrix and the result are allocated using the system call mmap as rw and shareable.
I was hoping to see COW in action by using top to track memory usage. What I get is the view below. At best this is an example "by contradiction": fork can't possibly be cloning and duplicating the memory of the initial matrix at the time of the call, since the system does not have 95 GiB of RAM, only 16 GiB.
(Incidentally, this shows that it doesn't make much sense to add up the values of different processes in any of top's memory columns. I wasn't aware of this, which I'm afraid partly explains why I failed to see clearly how copy-on-write takes place.)
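To make the "by contradiction" argument concrete, here is the arithmetic behind the figures quoted above, as a small sketch. All inputs (40,000 rows and columns, 8-byte doubles, 8 processes, 16 GiB of RAM) come from the description above; the variable names are only illustrative.

```python
# Sizes taken from the post: a 40,000 x 40,000 matrix of 8-byte doubles,
# 8 processes, and a machine with 16 GiB of RAM.
ROWS = COLS = 40_000
BYTES_PER_DOUBLE = 8
PROCESSES = 8
RAM_GIB = 16

GIB = 1024 ** 3

matrix_bytes = ROWS * COLS * BYTES_PER_DOUBLE
matrix_gib = matrix_bytes / GIB

# What the machine would need if every process held a full private copy.
all_copies_gib = PROCESSES * matrix_gib

print(f"one matrix: {matrix_bytes // 1024} KiB = {matrix_gib:.1f} GiB")
print(f"{PROCESSES} full copies: {all_copies_gib:.1f} GiB (RAM: {RAM_GIB} GiB)")
```

One matrix comes to 12,500,000 KiB, which matches both the 11.9 GiB that top shows per process and the Size field reported by smaps for each mapping.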
This doesn't seem like compelling evidence of COW in action. From an educational point of view it is a weak example, since every process "shows" 11.9 GiB in use. Instead of the clear example one could wish for, you need to argue along the lines of "ignore this, ignore that other value, then imagine... and you get where we wanted". Better would be some tool showing the actual portion of independent physical memory each process is using.
Here independent means: if you add up the memory usage shown for each process, you get the actual total physical memory the whole program is using.
Since top is the wrong tool for that, what could one use instead?
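One candidate for the "independent" figure defined above is the Pss (proportional set size) field of /proc/<pid>/smaps: the kernel charges each shared page to the processes mapping it in equal parts, so Pss values can be added across processes. The following is a minimal sketch, assuming a Linux /proc layout; the helper names pss_kb and total_pss_kb are made up for illustration and are not part of any existing tool.

```python
import re
from pathlib import Path

# Matches lines such as "Pss:             1562500 kB" in smaps output.
# Anchoring on "Pss:" skips related fields like "Pss_Anon:" on newer kernels.
PSS_LINE = re.compile(r"^Pss:\s+(\d+)\s+kB$", re.MULTILINE)


def pss_kb(smaps_text: str) -> int:
    """Sum the Pss field over every mapping in one process's smaps text."""
    return sum(int(kb) for kb in PSS_LINE.findall(smaps_text))


def total_pss_kb(pids) -> int:
    """Add up Pss over several processes by reading /proc/<pid>/smaps (Linux only)."""
    return sum(pss_kb(Path(f"/proc/{pid}/smaps").read_text()) for pid in pids)
```

Calling total_pss_kb with the parent's PID and the PIDs of its children should then give a total in which shared pages are counted only once. Reading another user's smaps file may require elevated privileges.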
EDIT: For example, would any of these tools help? If so, how would I use them to get that info? valgrind, smem, /proc/pid/smap...
EDIT2: Digging further, as hinted here, the best option I've found so far is smaps (or maybe pmap), used as follows:
( echo "Parent:"; pid=266694; grep -A9 "/dev/zero" /proc/$pid/smaps; echo " ";
  for ch in $(cat /proc/$pid/task/$pid/children); do
    echo -e "\n###\nChild: $ch"; grep -A9 "/dev/zero" /proc/$ch/smaps
  done ) > out
where 266694 is the parent's process ID. The part of that output that seems potentially useful here is the following:
Parent:
7fa5d9963000-7fa8d486b000 rw-s 00000000 00:01 18271562 /dev/zero (deleted)
Size: 12500000 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 35512 kB
Pss: 35512 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 35512 kB
--
7fa8d486b000-7fabcf773000 rw-s 00000000 00:01 18271561 /dev/zero (deleted)
Size: 12500000 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 12500000 kB
Pss: 1562500 kB
Shared_Clean: 0 kB
Shared_Dirty: 12500000 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
###
Child: 266706
7fa5d9963000-7fa8d486b000 rw-s 00000000 00:01 18271562 /dev/zero (deleted)
Size: 12500000 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 35872 kB
Pss: 35872 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 35872 kB
--
7fa8d486b000-7fabcf773000 rw-s 00000000 00:01 18271561 /dev/zero (deleted)
Size: 12500000 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 12500000 kB
Pss: 1562500 kB
Shared_Clean: 0 kB
Shared_Dirty: 12500000 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
The rest of the children show roughly the same values. I'm not yet sure how to interpret this correctly.
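One possible way to sanity-check these figures, offered as a sketch and not as a definitive reading: smaps counts a resident page as private when only one process maps it and as shared otherwise, and Pss divides each shared page among the processes mapping it. The numbers below are copied from the parent's output above. Note also the s in rw-s: it marks both regions as shared mappings, which fork shares between parent and child instead of treating them as copy-on-write. Which region is the input matrix and which is the result is not stated in the output, so the labels "first" and "second" are deliberately neutral.

```python
# Values in kB, copied from the parent's smaps output above.
first = {"Rss": 35_512, "Pss": 35_512, "Private_Dirty": 35_512, "Shared_Dirty": 0}
second = {"Rss": 12_500_000, "Pss": 1_562_500, "Private_Dirty": 0, "Shared_Dirty": 12_500_000}

# If every resident page of a mapping is shared by the same set of processes,
# Rss / Pss estimates how many processes map it.
sharers = second["Rss"] // second["Pss"]
print("processes sharing the second mapping:", sharers)

# Pss is additive: if each sharer reports this same Pss, the sum is the
# physical memory behind the mapping, counted once instead of once per process.
physical_kb = sharers * second["Pss"]
print("physical memory behind the second mapping:", physical_kb, "kB")

# In the first mapping Rss == Pss == Private_Dirty, i.e. the resident pages
# were mapped by this process only at the time of the snapshot.
first_is_private_only = first["Rss"] == first["Pss"] == first["Private_Dirty"]
print("first mapping resident pages private to this process:", first_is_private_only)
```

Under this reading the second region occupies about 12,500,000 kB of physical memory once, shared by 8 processes, while each process has touched only a few tens of MB of the first region on its own.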
However, I'm afraid my initial question has more to do with (*nix) memory management and the tools available for inspecting it, top and ps being clearly useless here (see (ext)). And that is a completely different topic, of which I know only an eps>0. Maybe I should change the title or remove the question altogether...