
I'm studying process execution on Linux 2.6.32 on a 64-bit box. While studying the outputs of /proc/$PID/maps, I observed one thing:

$ cat /proc/2203/maps | head -1
00400000-004d9000 r-xp 00000000 08:02 1050631              /bin/bash

$ cat /proc/27032/maps | head -1
00400000-00404000 r-xp 00000000 08:02 771580               /sbin/getty

It seems that the maps file for all the programs shows that the executable code for each program is loaded in a block of memory beginning at 0x00400000.
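To make the observation concrete, here is a minimal Python sketch that splits the two maps lines quoted above into their fields. The helper name `parse_maps_line` and the dictionary keys are made up for illustration; only the sample lines come from the output above.

```python
def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into named fields (illustrative helper)."""
    fields = line.split(None, 5)          # the pathname may be absent
    addr_range, perms, offset, dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(part, 16) for part in addr_range.split("-"))
    return {
        "start": start,
        "end": end,
        "perms": perms,
        "offset": int(offset, 16),
        "dev": dev,
        "inode": int(inode),
        "path": path,
    }


# The two sample lines from the maps output quoted above.
bash = parse_maps_line(
    "00400000-004d9000 r-xp 00000000 08:02 1050631              /bin/bash")
getty = parse_maps_line(
    "00400000-00404000 r-xp 00000000 08:02 771580               /sbin/getty")

# Same virtual start address, different sizes and backing files.
print(hex(bash["start"]), hex(getty["start"]))
print(bash["end"] - bash["start"], getty["end"] - getty["start"])
```

Both mappings start at the same virtual address but differ in length and in the file (device and inode) that backs them.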

I understand that these are virtual addresses. However, I don't get how these addresses can be the same for multiple concurrently running processes. What is the reason for loading all programs at a common start address, and how does the OS distinguish the virtual load point of one process from that of another?

Edit:

From my understanding of address space virtualization using paging, I thought part of the virtual address was used to look up the physical address of a memory block (a frame) by using it to index one or more page tables. Consider this case. The address looks 32-bit (this is another thing that baffles me -- why are the program addresses 32-bit, but the addresses of the loaded libraries are 64-bit?). Breaking the address into ten, ten, and twelve bits corresponding to the page directory entry, the page table entry, and the page offset respectively, shouldn't 0x00400000 always mean "page directory entry 1, page table entry 0, offset 0", no matter what program performs the address translation?
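The ten/ten/twelve split described above can be checked with a few lines of Python. This is only a sketch of the classic two-level 32-bit layout that the paragraph assumes; the function name `split_32bit` is made up for illustration.

```python
def split_32bit(vaddr):
    """Split a 32-bit virtual address into (directory, table, offset) indices."""
    directory = (vaddr >> 22) & 0x3FF   # top 10 bits: page directory entry
    table = (vaddr >> 12) & 0x3FF       # next 10 bits: page table entry
    offset = vaddr & 0xFFF              # low 12 bits: offset within the page
    return directory, table, offset


print(split_32bit(0x00400000))  # (1, 0, 0)
```

The indices are indeed the same in every process. What differs is the page directory they are applied to.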

One way I can see how this can be done is if the OS modified the page directory entry #1 to point to the page table corresponding to the program each time a task switch is performed. If that's the case, it sounds like a lot of added complexity -- given that program code is position-independent, won't it be easier to just load the program at an arbitrary virtual address and just go from there?

Ciro Santilli OurBigBook.com
susmits

2 Answers


The answer is that each process has its own page tables. They are switched when processes are switched.

More information at http://www.informit.com/articles/article.aspx?p=101760&seqNum=3.

The kernel switches the page tables when a context switch happens. On operating systems where the kernel is mapped into every process, the kernel pages can stay mapped across the switch. On the other hand, 32-bit operating systems that give user processes the full 4 GiB also have to switch address spaces when entering the kernel (on a syscall, for example).

While virtual addressing doesn't require different processes to have different page tables (the dependency goes the other way), I can't think of any current operating system that doesn't give each process its own page tables.
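As a rough illustration of the idea, here is a toy Python model with a single level of translation. The class names and the frame numbers are invented for the example and do not correspond to any real kernel structure; on real hardware the "switch" amounts to pointing the MMU at a different set of page tables.

```python
PAGE_SIZE = 4096


class Process:
    def __init__(self, name, page_table):
        self.name = name
        # Maps virtual page number -> physical frame number (made-up values).
        self.page_table = page_table


class ToyMMU:
    def __init__(self):
        self.active_table = None

    def context_switch(self, process):
        # Switching processes just swaps which page table is consulted.
        self.active_table = process.page_table

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        frame = self.active_table[vpn]  # a KeyError here stands in for a page fault
        return frame * PAGE_SIZE + offset


# Both processes map virtual page 0x400 (address 0x00400000),
# but to different, hypothetical physical frames.
bash = Process("bash", {0x400: 0x1A2B})
getty = Process("getty", {0x400: 0x7F00})

mmu = ToyMMU()
mmu.context_switch(bash)
bash_phys = mmu.translate(0x00400000)
mmu.context_switch(getty)
getty_phys = mmu.translate(0x00400000)

print(hex(bash_phys), hex(getty_phys))
```

The same virtual address resolves to a different physical address depending on whose page table is active, which is why two processes can both use 0x00400000 without conflict.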

Douglas Leeder

Q: I understand that these are virtual addresses.

A: Good...

Q: However, I don't get how these addresses can be the same for multiple concurrently running processes.

A: I thought you just said you understood "virtual addresses" ;)?

Q: What is the reason behind using a common start address for loading all processes?

A: Remember, it's a virtual address - not a physical address. Why not have some standard start address?

And remember - you don't want to make the start address "0" - there are a lot of specific virtual addresses (especially those under 640K) that a process might wish to map as though they were physical addresses.

Here's a good article that touches on a few of these issues, including "e_entry":

How main() is executed on Linux

jopasserat
paulsm4
  • Paul is right. Your question is unintentionally humorous! Suppose that I said, "I understand that people have surnames, but I don't get how multiple users Paul Smith and Paul Jones can be concurrently answering questions on Stackoverflow." Your question is of this kind. – thb Jun 01 '12 at 05:42
  • Apparently I was unable to phrase my question properly; I have edited the question to be more specific. – susmits Jun 01 '12 at 12:52
  • Here's a very good article on Linux memory mapping: http://tldp.org/LDP/tlk/mm/memory.html. – paulsm4 Jun 01 '12 at 16:16
  • It's also worth noting that, fundamentally, "virtual memory" is a "hardware thing". Yes, the OS manages the page tables. But the actual mapping of a virtual address to a physical address is ultimately done by the system's MMU hardware, *not* by the OS. Hence, from the software's perspective, a "virtual address" (which, in theory, can map to a *different* physical address at *any time*) is as "real as it gets". – paulsm4 Jun 01 '12 at 16:16