
I am trying to understand why Page Size is specified as part of an ISA.

More specifically, I am looking for details of how any hardware module (MMU, TLB), as opposed to the operating system, uses the Page Size information to provide a certain functionality.

Please let me know the reasons Page Size has to be part of the ISA instead of just being decided by the OS.

Thanks.

Uchia Itachi

2 Answers


The TLB hardware has to know the page size to figure out whether a translation applies to an address or not. For example, given a translation for some address, does an address 2500 bytes above it use the same translation or not?

Or to put it another way, the TLB has to know which address bits are part of the page offset (within a page), and which bits need translating from virtual to physical.
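As a rough sketch of that bit split (not how any particular TLB is built), assuming 4 KiB pages: the names `PAGE_SHIFT`, `split_address`, `same_page` and `translate` are made up for illustration, and the dict stands in for whatever structure holds the translations.

```python
# Assumption: 4 KiB pages, so the low 12 bits are the offset within a page.
PAGE_SHIFT = 12
PAGE_SIZE = 1 << PAGE_SHIFT
OFFSET_MASK = PAGE_SIZE - 1


def split_address(vaddr):
    """Return (virtual page number, offset within the page)."""
    return vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK


def same_page(a, b):
    """Two addresses share a translation only if their page numbers match."""
    return (a >> PAGE_SHIFT) == (b >> PAGE_SHIFT)


def translate(vaddr, vpn_to_pfn):
    """Translate only the page-number bits; the offset passes through untouched.

    vpn_to_pfn is a hypothetical dict mapping virtual page number -> physical frame number.
    """
    vpn, offset = split_address(vaddr)
    pfn = vpn_to_pfn[vpn]
    return (pfn << PAGE_SHIFT) | offset
```

With this split, an address 2500 bytes above `0x1000` is still in the same page, but 2500 bytes above `0x1800` is not. Without knowing `PAGE_SHIFT`, the hardware couldn't tell those two cases apart.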

Also, on architectures with HW page walk, the whole page table format is part of the ISA, and the typical design uses the virtual page number as an index to find the right entry (e.g. x86-64's 4-level page tables), rather than doing a linear or binary search through the page table to find an entry that contains the virtual address being searched for. Normally this same design is used for page tables walked by software, AFAIK.
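To illustrate the indexing, here is a minimal sketch of how a 48-bit x86-64 virtual address is carved up for 4-level paging with 4 KiB pages: four 9-bit table indices plus a 12-bit page offset. The function name `table_indices` is just for this example; real hardware does this with wiring, not code.

```python
def table_indices(vaddr):
    """Split a 48-bit x86-64 virtual address for 4-level paging with 4 KiB pages.

    Returns (pml4_index, pdpt_index, pd_index, pt_index, page_offset).
    Each table has 512 entries, hence 9 index bits per level.
    """
    offset = vaddr & 0xFFF           # bits 11:0, not translated
    pt = (vaddr >> 12) & 0x1FF       # bits 20:12, index into the page table
    pd = (vaddr >> 21) & 0x1FF       # bits 29:21, index into the page directory
    pdpt = (vaddr >> 30) & 0x1FF     # bits 38:30, index into the PDPT
    pml4 = (vaddr >> 39) & 0x1FF     # bits 47:39, index into the PML4
    return pml4, pdpt, pd, pt, offset
```

The shift amounts are fixed by the page size and table size, which is why the page-walk hardware and the OS that builds the tables have to agree on them.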


It is possible to build a TLB where each entry has a mask to control how many address bits it matches, i.e. where a single TLB can have entries for pages of multiple sizes. This only works if pages have power-of-2 sizes and are naturally aligned (i.e. the start address of a page is always some multiple of its size, so zeroing the low bits of an address inside a page gives you the page-start address).
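A toy model of such a masked match, assuming power-of-2, naturally aligned pages as described: `TlbEntry` and `lookup` are hypothetical names, and a real TLB would compare all entries in parallel rather than loop over them.

```python
from dataclasses import dataclass


@dataclass
class TlbEntry:
    vbase: int  # virtual start of the page, a multiple of size
    pbase: int  # physical start of the page, a multiple of size
    size: int   # page size in bytes, must be a power of 2

    def matches(self, vaddr):
        # Zeroing the low bits gives the page-start address (natural alignment).
        return (vaddr & ~(self.size - 1)) == self.vbase


def lookup(tlb, vaddr):
    """Return the physical address, or None on a TLB miss."""
    for entry in tlb:
        if entry.matches(vaddr):
            return entry.pbase | (vaddr & (entry.size - 1))
    return None
```

For example, with one 2 MiB entry `TlbEntry(0x200000, 0x40000000, 2 << 20)` and one 4 KiB entry `TlbEntry(0x1000, 0x5000, 4096)` in the same list, each lookup compares a different number of address bits depending on which entry it is checked against.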

You could potentially use this with an extent-based page-table format, where you have one entry for each contiguous mapping instead of one entry for each page.

Page-walks would probably be more costly, having to check more entries for more mappings, but the same number of TLB entries could cover more address space.
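Purely as a thought experiment (I'm not describing any real ISA's format here), an extent-based lookup could look something like this. The tuple layout `(virtual_start, length, physical_start)` and the name `find_extent` are invented for the sketch, and it assumes the extents are sorted by virtual start and don't overlap.

```python
import bisect


def find_extent(extents, vaddr):
    """Translate vaddr using a sorted list of (virtual_start, length, physical_start).

    Returns the physical address, or None if no extent covers vaddr.
    """
    starts = [e[0] for e in extents]
    # Find the last extent whose start is <= vaddr (a binary search, not an index).
    i = bisect.bisect_right(starts, vaddr) - 1
    if i >= 0:
        vstart, length, pstart = extents[i]
        if vaddr < vstart + length:
            return pstart + (vaddr - vstart)
    return None
```

Note that this needs a search plus a range check instead of a direct index by page number, which is the extra walk cost mentioned above. In exchange, extents don't need to be power-of-2 sized.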

In many cases OSes want to be able to easily unmap or even page out unused pages, and this conflicts with using huge pages that cover a mix of hot and cold data or especially code. (But normal fixed-size hugepages have this problem, too, so x86-64's 2M and 1G hugepages aren't always a win vs. standard 4k pages.)

Peter Cordes
  • Just a hypothetical scenario: assume there is no hardware TLB and no HW page walk, but an MMU to assist the translations. Then there is no need for Page Size to be part of the ISA, correct? Because the OS manages the page table and the page walk, and the MMU just translates the VPN part excluding the offset. Or do we need the Page Size information elsewhere in the architecture? – Uchia Itachi Nov 02 '17 at 22:47
  • @UchiaItachi: Nothing else comes to mind. Translation to physical happens early, and after that the rest of the cache/memory system uses physical addresses. A design without a page-based TLB could use base/limit translation entries, software managed. You could use it with a fixed size for every translation, but the HW would probably only provide a few translation slots so you'd be much better to use it like extents: a translation for the whole range that you're mapping. – Peter Cordes Nov 03 '17 at 01:42
  • Without any HW translation, all an MMU could do is invoke software translation on *every* memory access. For most workloads on a normal CPU, virtual memory costs a couple % performance. Invoking a software miss-handler on every memory access would probably make that penalty at least 100%, maybe 1000%. Oh, that would have to include code-fetch, so maybe a factor of 10 is an underestimate. More plausible is if you could supply firmware for a miss-handler unit (instead of having the whole CPU take an exception), or if the page size was simply configurable. – Peter Cordes Nov 03 '17 at 01:46

Page size isn't a part of the ISA (what a compiler would normally emit) for x86_64. The instruction set architecture for x86_64 is formally known as Intel® 64 Architecture, and it is briefly described in section 2.2.10 (volume 1) of the Intel® 64 and IA-32 Architectures Software Developer’s Manual. It describes what an application program can see and do. There is something similar for ARMv8.

Instead, page size is left to the OS, and it isn't a part of the ISA. This is because page sizes can vary amongst implementations and can vary according to mode settings (4K/2M/4M/1G). x86_64 implementations present something like an ISA to the OS which Intel refers to as the system programming level (what an OS would use). That's described in Chapter 13 of volume 2 of Intel's Software Developer's Manual.

That level describes page sizes and modes. But a 'correct' application program should run with different page sizes on different systems in different page size modes.

Olsonist
  • Why are you limiting your definition of "ISA" to "non-privileged instructions"? That's not the normal definition of ISA. I think most people agree that an ISA includes CPU behaviour that a kernel relies on, including how HW page walks work, and how big a chunk is affected by `invlpg`. i.e. x86's control registers are part of the ISA. And BTW, *Intel® 64 Architecture* is Intel's name for their version of x86-64. AMD certainly doesn't call it that, and their CPUs implement almost exactly the same ISA (basically fully compatible unprivileged, a few minor diffs for kernels). – Peter Cordes Nov 12 '19 at 01:26
  • I think like a lot of people, I conflated ABI and ISA. – Olsonist Nov 13 '19 at 04:56
  • Ah. That makes sense. The ISA is kind of like an ABI for the kernel dealing with the hardware (but includes lots of rules for semantics of things because it involves memory ordering rules). A software ABI is of course just for software cooperating with other software (including the kernel), with the ISA taken as a given. – Peter Cordes Nov 13 '19 at 04:59