I'm studying process execution on Linux 2.6.32 on a 64-bit box. While studying the output of /proc/$PID/maps, I observed one thing:
$ cat /proc/2203/maps | head -1
00400000-004d9000 r-xp 00000000 08:02 1050631 /bin/bash
$ cat /proc/27032/maps | head -1
00400000-00404000 r-xp 00000000 08:02 771580 /sbin/getty
It seems that the maps file for every program shows that the executable code is loaded in a block of memory beginning at 0x00400000.
I understand that these are virtual addresses. However, I don't get how these addresses can be the same for multiple concurrently running processes. What is the reason for using a common start address when loading every process, and how does the OS distinguish the virtual load point of one process from that of another?
Edit:
From my understanding of address space virtualization using paging, I thought part of the virtual address was used to look up the physical address of a memory block (a frame) by using it to index one or more page tables. Consider this case. The address looks 32-bit (this is another thing that baffles me -- why are the program addresses 32-bit, but the addresses of the loaded libraries are 64-bit?). Breaking the address into ten, ten, and twelve bits corresponding to the page directory entry, the page table entry, and the page offset respectively, shouldn't 0x00400000 always mean "page directory entry 1, page table entry 0, offset 0", no matter what program performs the address translation?
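To make the arithmetic I have in mind concrete, here is a minimal sketch of the ten/ten/twelve split. The function name split_virtual_address and the constants are made up for illustration, and the layout is the classic 32-bit two-level scheme I am assuming above, which may well not be what a 64-bit kernel actually uses.

```python
# Assumed layout (from the question, not verified for this machine):
# 10 bits directory index | 10 bits table index | 12 bits page offset.
PAGE_OFFSET_BITS = 12
INDEX_BITS = 10
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1


def split_virtual_address(vaddr):
    """Return (directory index, table index, page offset) for a 32-bit address."""
    offset = vaddr & OFFSET_MASK
    table_index = (vaddr >> PAGE_OFFSET_BITS) & INDEX_MASK
    directory_index = (vaddr >> (PAGE_OFFSET_BITS + INDEX_BITS)) & INDEX_MASK
    return directory_index, table_index, offset


# Start address shared by /bin/bash and /sbin/getty in the maps output.
print(split_virtual_address(0x00400000))  # (1, 0, 0)
```

The split depends only on the bits of the address, so it gives the same indices for every process, which is exactly what puzzles me.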
One way I can see this working is if the OS modified page directory entry #1 to point to the page table of the incoming program each time a task switch is performed. If that's the case, it sounds like a lot of added complexity -- given that program code is position-independent, wouldn't it be easier to just load the program at an arbitrary virtual address and go from there?