I have often heard the term "address space" in the context of microprocessors and microcontrollers. I understand that an address is used to refer to a particular block of memory in physical (primary) memory.

If I'm right, an address space is the set of all such addresses. Right?

By using virtual memory/paging, we extend the address space using secondary storage.

In this context, what exactly are a page table, a page table entry, and a page directory? I understand that physical memory is first segmented logically and that these segments are divided into pages. So what exactly is a page table? A table containing pages? And what is a page directory? A super-table of page tables?

Peter Cordes
techno
  • See [osdev.org](http://wiki.osdev.org/Paging) or the official intel manual. – Jester Apr 29 '15 at 15:31
  • You've already got your answer, but I would like to enhance the coverage of the question a little bit more. Check out the following link to see a different approach to the optimization of page tables [Operating Systems: Three Easy Pieces](http://pages.cs.wisc.edu/~remzi/OSTEP/vm-smalltables.pdf) – Caglayan DOKME Jan 17 '21 at 08:55

1 Answer

In the x86 architecture, page directories and page tables together provide the mapping between virtual addresses (memory addresses used by applications) and physical addresses (actual locations in the physical memory hardware).

A page is simply a contiguous chunk of memory. 32-bit x86 supports three page sizes: 4MB, 2MB, and 4KB, with 4KB being the most commonly used in mainstream operating systems. A page table is an array of 1024 32-bit entries (conveniently fitting into a single 4KB page). Each entry holds the physical address of a page. Because a single page table cannot represent the entire address space on its own (1024 entries × 4KB = 4MB, which is only 22 bits of address space), we require a second level: a page directory. A page directory also consists of 1024 32-bit entries (again fitting into a single page), each pointing to a page table. Now 1024 × 1024 × 4KB = 4GB, which is 32 bits of address space, so this two-level structure can map the entire 4GB virtual address space.
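
The arithmetic in that paragraph can be checked with a short sketch. The constant and function names below are illustrative only, and the sizes assume classic 32-bit paging with 4KB pages and 4-byte entries.

```python
# Sizes for classic 32-bit x86 paging with 4KB pages (an assumption of this sketch).
PAGE_SIZE = 4 * 1024                          # bytes per page
ENTRY_SIZE = 4                                # bytes per 32-bit entry
ENTRIES_PER_TABLE = PAGE_SIZE // ENTRY_SIZE   # entries that fit in one page


def address_bits(num_bytes):
    """Return the number of address bits needed to cover num_bytes (a power of two)."""
    return num_bytes.bit_length() - 1


# One page table maps ENTRIES_PER_TABLE pages.
table_coverage = ENTRIES_PER_TABLE * PAGE_SIZE
# One page directory points to ENTRIES_PER_TABLE page tables.
directory_coverage = ENTRIES_PER_TABLE * table_coverage

print(ENTRIES_PER_TABLE)                  # 1024
print(address_bits(table_coverage))       # 22
print(address_bits(directory_coverage))   # 32
```

Both a page table and a page directory occupy exactly one page here, which is why 1024 entries is the natural size for each.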

When the CPU is asked to access a virtual address, it uses the 10 highest-order bits (31:22) to index into the page directory (the base address of which is stored in a special register, CR3 on x86). The next 10 bits (21:12) are used to index into the page table pointed to by the page directory entry. Finally, the 12 lowest-order bits (11:0) are used to index a byte in the page pointed to by the page table entry.
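
The walk described above can be sketched as a toy model. This is only an illustration: nested dictionaries stand in for the directory and tables, the function names and the sample mapping are made up, and the flag bits that real entries carry (present, writable, and so on) are left out.

```python
PAGE_SHIFT = 12                        # low 12 bits: byte offset within a 4KB page
INDEX_BITS = 10                        # 10 bits each for directory and table indexes
INDEX_MASK = (1 << INDEX_BITS) - 1
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    dir_index = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK   # bits 31:22
    table_index = (vaddr >> PAGE_SHIFT) & INDEX_MASK                # bits 21:12
    offset = vaddr & OFFSET_MASK                                    # bits 11:0
    return dir_index, table_index, offset


def translate(page_directory, vaddr):
    """Walk the toy two-level structure; raise LookupError if nothing is mapped."""
    dir_index, table_index, offset = split_virtual_address(vaddr)
    page_table = page_directory.get(dir_index)
    if page_table is None:
        raise LookupError("no page table for this directory entry")
    frame_base = page_table.get(table_index)
    if frame_base is None:
        raise LookupError("no page mapped for this table entry")
    return frame_base + offset


# Hypothetical mapping: directory entry 1 -> a table whose entry 31 -> frame 0x00ABC000.
toy_directory = {1: {31: 0x00ABC000}}

print(split_virtual_address(0x0041FF10))           # (1, 31, 3856)
print(hex(translate(toy_directory, 0x0041FF10)))   # 0xabcf10
```

Only the upper 20 bits are translated. The 12-bit offset is carried over unchanged, because a page is contiguous in both the virtual and the physical view.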

In other systems there may be more or fewer levels of page table, depending on the size of the virtual address space and the page sizes supported. For example, x86 with 4MB pages needs only a single page directory. In 64-bit mode with 4KB pages, a 4-level system is used: a page map level 4 (PML4) table contains entries that point to page-directory-pointer tables, whose entries point to page directories, whose entries in turn point to page tables.
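
As a rough sketch of the 64-bit case, the same splitting idea extends to four 9-bit indexes plus a 12-bit offset. This assumes 48-bit virtual addresses, 4KB pages, and 8-byte entries (512 per table); the function name is made up for illustration.

```python
PAGE_SHIFT = 12                  # 4KB pages: 12-bit offset
LEVEL_BITS = 9                   # 512 eight-byte entries fit in one 4KB table
LEVELS = 4                       # PML4, page-directory-pointer table, page directory, page table
LEVEL_MASK = (1 << LEVEL_BITS) - 1


def split_4level(vaddr):
    """Return ([pml4, pdpt, pd, pt] indexes, offset) for a 48-bit virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    indexes = []
    for level in reversed(range(LEVELS)):       # level 3 is the top level (PML4)
        shift = PAGE_SHIFT + level * LEVEL_BITS
        indexes.append((vaddr >> shift) & LEVEL_MASK)
    return indexes, offset


# Build an address whose four indexes are 1, 2, 3, 4 and whose offset is 5.
sample = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_4level(sample))      # ([1, 2, 3, 4], 5)
```

Four levels of 9 bits plus the 12-bit offset account for 48 address bits in total.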

The Intel 64 and IA-32 Architectures Software Developer's Manual has much more information about the topic, particularly in chapters 3 and 4 of Volume 3.

peterdn
  • 2
    But doesn't it mean that when 2 different programs try to access the virtual address 0x0041FF10 , they will get the same physical address? The CPU takes the same number of bits for indexing out of the same virtual address, which translates into equal indexes.. – W2a Nov 19 '16 at 15:08
  • 7
    The OS will typically maintain separate page directories and tables for each process, providing different mappings from virtual to physical addresses. Recall that the base address of the current page directory is stored in a special register. The value in this register is changed by the OS during a context switch to another process. Therefore although the indexes are equal for process A and process B, they index into different page directories. – peterdn Nov 20 '16 at 01:23
  • 2
    On the Intel x86, that is controlled via the CR3 register. – Per Lundberg Nov 05 '17 at 15:33
  • 1
    > _A page is simply a contiguous chunk of memory_ I think you should explicitly say **virtual/logical memory**, because the **physical memory** used by a process is non-contiguous. – Chen Li Jan 01 '19 at 06:28
  • Why does the table/directory at every level have to be the same size as a page, instead of using just one big table? What's so convenient about it? – huggie Feb 13 '19 at 00:48
  • 1
    @Nikos that's right, normally one page directory for each running process, in order to keep them isolated. Note that they might share some page tables or pages for shared memory purposes. Also note the page directory doesn't have to be full (not all 1024 child page tables need to exist, only the ones that map memory the process actually uses). – peterdn Aug 27 '19 at 08:22
  • @Nikos without seeing it, I suspect that statement means 4 _levels of page table_, not 4 individual page tables. I explain above in my answer's penultimate paragraph that in 64-bit systems where virtual addresses are longer than 32 bits, 3 levels aren't enough to address individual 4KB pages, and so a 4-level structure is needed instead. In this case, each running process would have its own individual 4-level data structure. – peterdn Aug 27 '19 at 15:54
  • @peterdn when we consider the amount of space a single page table can reference (1024 entries * 4KB = only 22-bits of address space) why is it that we have 22-bits instead of 22-bytes? It seems that each of those 1024 entries references 4 Kilobytes worth of memory, thus leading to (2^10 entries) * (2^12 bytes / entry) = 2^22 bytes – jeffhu May 21 '20 at 17:47
  • 2
    @jeffhu the 22 bits refers to the size of the _address space_ i.e. an address would be a 22-bit value (not a 22-byte value). Then as you have correctly reasoned, this provides a total of 2^22 possible values for that address. – peterdn May 25 '20 at 12:42
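
The point made in the comments about per-process page directories can be illustrated with a toy model. Everything here is hypothetical: `ToyCPU`, its `cr3` attribute, and the frame addresses are invented for the sketch, and dictionaries stand in for sparse directories in which only the entries a process uses exist.

```python
class ToyCPU:
    """Toy model: `cr3` holds a reference to the current page directory."""

    def __init__(self):
        self.cr3 = None

    def context_switch(self, page_directory):
        # On a context switch the OS loads the next process's page directory base.
        self.cr3 = page_directory

    def translate(self, vaddr):
        dir_index = (vaddr >> 22) & 0x3FF      # bits 31:22
        table_index = (vaddr >> 12) & 0x3FF    # bits 21:12
        offset = vaddr & 0xFFF                 # bits 11:0
        page_table = self.cr3.get(dir_index)
        if page_table is None or table_index not in page_table:
            raise LookupError("page fault: address not mapped")
        return page_table[table_index] + offset


# Two processes map the same virtual page to different (made-up) physical frames.
process_a = {1: {31: 0x00100000}}
process_b = {1: {31: 0x00200000}}

cpu = ToyCPU()
cpu.context_switch(process_a)
addr_a = cpu.translate(0x0041FF10)
cpu.context_switch(process_b)
addr_b = cpu.translate(0x0041FF10)

print(hex(addr_a), hex(addr_b))   # 0x100f10 0x200f10
```

The indexes derived from the virtual address are identical for both processes. Only the directory they index into changes, which is what keeps the two address spaces separate.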