Paging - What is exactly "inside" of a Page?

Question

From my understanding of how virtual address translation happens (assuming a 32 bit virtual address space, as in the x86 architecture):

bits 31:22 of the virtual address indicate the proper Page Directory Entry (i.e. aligned physical address of the Page Table) to access
similarly, bits 21:12 of the virtual address indicate the proper Page Table Entry (i.e. aligned physical address of the Page itself) to access
lastly, bits 11:0 indicate the proper offset into the Page, and are merely appended onto the end of the final physical address to access

As for the physical address

bits 31:12 are the physical base address (aligned) of the Page (found from the PTE)
bits 11:0 are the same as they are in the virtual address
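As a quick sanity check of the bit ranges listed above, here is a minimal sketch that splits a 32-bit virtual address into the three fields. It assumes the classic two-level x86 layout with 4 KiB pages; the function name `split_virtual_address` and the sample address are made up for illustration.

```python
def split_virtual_address(vaddr):
    """Split a 32-bit virtual address into (directory index, table index, offset)."""
    pd_index = (vaddr >> 22) & 0x3FF  # bits 31:22 select the Page Directory Entry
    pt_index = (vaddr >> 12) & 0x3FF  # bits 21:12 select the Page Table Entry
    offset = vaddr & 0xFFF            # bits 11:0 are the byte offset into the page
    return pd_index, pt_index, offset


# Hypothetical sample address, chosen only to make the fields easy to read.
pd_index, pt_index, offset = split_virtual_address(0x12345678)
print(hex(pd_index), hex(pt_index), hex(offset))
```

The offset is the only field that passes through translation unchanged; the two indices are only used to walk the tables.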

Visually, this is what happens

I know that a Page is a chunk of virtual memory. But conceptually, I have a hard time visualizing what is actually "inside" of a Page. Ie, if I were to index into a "Page entry" (if such a thing even makes sense), what would I get in return?

This post seems to refer this value as a "desired byte." What exactly is the "desired byte"? Am I overthinking the functionality of Pages?

'This post seems to refer this value as a "desired byte."' -> I can't find that phrase anywhere in that post. Where did you get that term from? — Jongware, Jan 10 '18 at 11:03

score 1 · Answer 1 · answered Jan 12 '18 at 20:48

From reading your question, it sounds like your misunderstanding is because you do not understand the format of a page table entry.

The memory management unit (MMU) of the CPU divides the physical memory into PAGE FRAMES of some fixed size (typically 512K to 1MB).

The operating system manages PAGES of memory. A page must have the same size as a page frame. User mode processes only see pages, not page frames.

The operating system maintains sets of PAGE TABLES that provide the mapping between the pages and page frames.

In a logical memory system, the bits within an address consist of two bit fields. One bit field identifies the page and the other specifies the byte offset into the page.

When a process accesses an address, the MMU divides it into the two bit fields. It then uses the page identifier to look up, in the page tables, which page frame the page is mapped to.

> Ie, if I were to index into a "Page entry" (if such a thing even makes sense), what would I get in return?

The page entry (or entry in the page table) specifies the number of the physical page frame.

[This is the part it sounds like you are missing.]

In your example, you discuss a multi-level page table, but for simplicity, let's assume there is no page directory, and just a page table.

On a 32-bit system, the page entry will typically be 32 bits wide, and 64 bits wide on a 64-bit system. The format of the page entry varies among systems, but it will likely have bit fields that define:

The index of the page frame mapped to.
Bits indicating if the entry is valid.
A bit that indicates if the corresponding page has been written to.
Bits that specify the protection for the page.
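To make those fields concrete, here is a rough sketch that decodes one entry. It assumes the 32-bit x86 (non-PAE) layout for a 4 KiB page: bit 0 present, bit 1 writable, bit 2 user, bit 5 accessed, bit 6 dirty, bits 31:12 page frame number. Other architectures lay the bits out differently, and the sample entry value is invented.

```python
PAGE_SHIFT = 12  # assumes 4 KiB pages


def decode_pte(pte):
    """Decode a 32-bit x86 (non-PAE) page table entry into named fields."""
    return {
        "present": bool(pte & 0x001),   # entry is valid
        "writable": bool(pte & 0x002),  # protection: writes allowed
        "user": bool(pte & 0x004),      # protection: user-mode access allowed
        "accessed": bool(pte & 0x020),  # set by the MMU when the page is touched
        "dirty": bool(pte & 0x040),     # set by the MMU when the page is written
        "frame": pte >> PAGE_SHIFT,     # index of the page frame mapped to
    }


# Invented entry: frame 0x9A with present, writable, user, accessed and dirty set.
fields = decode_pte(0x0009A067)
print(fields)
```

Note that the entry holds a frame number plus flags, not the data itself. The data lives in the page frame the entry points at.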

In your example you have omitted the format of the page table entry.

So once you have the entry, the next step is to get the page frame from it. In your example, this is 4096 bytes of data.

The MMU could either use the page frame index directly to identify the page frame, or it could multiply that value by the page size to get the address of the first byte of the frame.

To get the specific byte within the 4096, the MMU uses the offset (bits 11:0 in your example).
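Putting the steps together, a simplified single-level lookup might look like the sketch below. This is a software model of what the hardware does, not how any real MMU is implemented. The dictionary-based `page_table`, the `PageFault` exception and the sample mapping are all assumptions made for illustration.

```python
PAGE_SIZE = 4096
PAGE_SHIFT = 12  # log2(PAGE_SIZE)


class PageFault(Exception):
    """Raised when a virtual page has no valid mapping."""


def translate(page_table, vaddr):
    """Translate a virtual address using a single-level page table."""
    page_number = vaddr >> PAGE_SHIFT  # which page the address falls in
    offset = vaddr & (PAGE_SIZE - 1)   # which byte within that page
    entry = page_table.get(page_number)
    if entry is None or not entry["valid"]:
        raise PageFault(hex(vaddr))
    # Frame index times page size gives the first byte of the frame;
    # adding the offset gives the specific byte.
    return entry["frame"] * PAGE_SIZE + offset


# Invented mapping: virtual page 5 lives in physical frame 0x9A.
page_table = {0x5: {"valid": True, "frame": 0x9A}}
paddr = translate(page_table, 0x5ABC)
print(hex(paddr))
```

With this mapping, virtual address 0x5ABC lands at physical address 0x9AABC: the offset 0xABC is unchanged and only the upper bits are replaced.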

The MMU does all this behind the scenes so the process never sees it. One of the chief jobs of the operating system is to maintain the page tables and the entries within them.

Your "typically 512K to 1MB" claim seems way off. Most virtual-memory systems use 4kiB pages / page-frames. e.g. x86. [MIPS supports a choice of 4k / 16k / 64k](https://www.linux-mips.org/wiki/Page_size). Some systems have "hugepages" for large mappings, like x86-64's 2MiB or 1GiB hugepages. https://unix.stackexchange.com/questions/128213/how-is-page-size-determined-in-virtual-address-space says that "normal" page size is 4kiB on pretty much everything, but AArch64 allows page tables that use 16kiB granularity instead of 4k. — Peter Cordes, Jan 13 '18 at 05:13
Apparently, you are limited in the range of systems you have worked on. — user3344003, Jan 13 '18 at 22:33
Yes, I've only really worked with x86 (and some SPARC machines at university), and read about a few other mainstream architectures, especially ARM, MIPS, and PowerPC. I'm not saying there are no systems where 512k or 1M pages are the smallest option, I'm saying that's not "typical". You need enough examples to outweigh most of the architectures Linux runs on... — Peter Cordes, Jan 13 '18 at 23:11
I'm with Peter here. Almost no matter how you weigh it, the vast majority of mainstream architectures (for a wide definition of "mainstream") have their minimum/default page size less than 512K, usually much less. You should provide some counter-examples if you know otherwise. — BeeOnRope, Jan 13 '18 at 23:32
Have you not seen a VAX? While a bit old, they are all over the place controlling critical functions using a 512 Byte page. — user3344003, Jan 14 '18 at 23:32

score 0 · Answer 2 · answered Jan 10 '18 at 04:27

You can think of each page as a 4096 byte "array". (On x86 with 4k pages). It's not an "array" in the high-level-language sense of the word (unless you happen to have a page-aligned array of bytes in your program), but it is a linear collection of bytes that you can index with an offset.

A one-byte load from any specific address has an offset within a specific page, which determines which of those 4096 bytes should be loaded. The low 12 bits of the address determine this offset. (i.e. the "page offset").

Note that 2^12 = 4096, which is why the low 12 bits of an address represent the offset within the page.
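If it helps, here is a toy model of that "array" view, assuming 4 KiB pages. The `bytearray` stands in for the contents of one page, and the address and stored value are arbitrary.

```python
PAGE_SIZE = 4096  # 2**12 bytes, assuming 4 KiB pages

# Model the contents of one page as a flat run of 4096 bytes.
page = bytearray(PAGE_SIZE)
page[0x678] = 0x2A  # pretend some earlier store wrote this byte

# Arbitrary address that falls somewhere in this page.
address = 0x12345678
offset = address & (PAGE_SIZE - 1)  # keep only the low 12 bits
value = page[offset]                # the byte a one-byte load would return
print(hex(offset), value)
```

So "indexing into a page" just gives you a byte of ordinary memory. Whatever your program stored at that address is what comes back.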

Further reading: What Every Programmer Should Know About Memory?
