
Disclaimer: this is a repost from Super User, following a comment that the question might be more suitable for Stack Overflow.

In the Intel 64 and IA-32 Architectures Software Developer's Manual (June 2023 edition), Volume 3A, Section 4.10.2.4, the explanation of TLBs says the following about "Global Pages":

> The Intel-64 and IA-32 architectures also allow for global pages when the PGE flag (bit 7) is 1 in CR4. If the G flag (bit 8) is 1 in a paging-structure entry that maps a page (either a PTE or a paging-structure entry in which the PS flag is 1), any TLB entry cached for a linear address using that paging-structure entry is considered to be global. Because the G flag is used only in paging-structure entries that map a page, and because information from such entries is not cached in the paging-structure caches, the global-page feature does not affect the behavior of the paging-structure caches.
> A logical processor may use a global TLB entry to translate a linear address, even if the TLB entry is associated with a PCID different from the current PCID.
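
To make the conditions in the quoted passage concrete, here is a minimal C sketch of the check it describes. The bit positions (CR4.PGE = bit 7, G = bit 8, PS = bit 7, P = bit 0) are the architectural ones; the function names and the `level` convention (1 = PTE, 2 = PDE, 3 = PDPTE) are made up for illustration and are not taken from any real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_PGE  (1ULL << 7)  /* CR4.PGE: enables the global-page feature */
#define ENTRY_P  (1ULL << 0)  /* present */
#define ENTRY_PS (1ULL << 7)  /* PS: a PDE/PDPTE maps a large page directly */
#define ENTRY_G  (1ULL << 8)  /* G: only meaningful in entries that map a page */

/* Hypothetical helper: level 1 = PTE, 2 = PDE, 3 = PDPTE. */
static bool entry_maps_page(uint64_t entry, int level)
{
    assert(level >= 1 && level <= 3);
    if (!(entry & ENTRY_P))
        return false;
    /* A PTE always maps a page; higher levels do so only when PS = 1. */
    return level == 1 || (entry & ENTRY_PS) != 0;
}

/* True if a TLB entry created from this paging-structure entry would be global. */
static bool translation_is_global(uint64_t cr4, uint64_t entry, int level)
{
    return (cr4 & CR4_PGE) != 0
        && entry_maps_page(entry, level)
        && (entry & ENTRY_G) != 0;
}
```

Under these assumptions, bit 8 in a PDE that merely points to a page table (PS = 0) has no effect, which matches the manual's remark that the G flag is used only in entries that map a page.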

I understand that when a page is not global, the TLB entry for its mapping is tied to a specific PCID, and that when the page is global, the TLB entry can be used under any PCID.
What I do not understand is when software would ever use this feature. At first I thought it might be used for thread groups that share the same CR3 value (so that they are essentially in the same virtual address space), or in cases like fork, where a virtual address space is shared and copied on write.

However, that doesn't make sense, since global pages apparently have to apply to all processes, not just certain ones. My best guess now is that they are used for kernel virtual address translations, because (I'm not 100% sure, but) there is a fixed region of the virtual address space that the kernel uses and that has the same translation in every process.
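
The kernel-mapping guess can be illustrated with a toy software model of a PCID-tagged TLB. This is only a sketch of the semantics described in the manual, not of real hardware: the struct layout, slot count, and function names are all invented for the example. `tlb_flush_pcid` models a `mov` to `cr3` that invalidates the entries of one PCID while leaving global entries alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 8  /* arbitrary size for the toy model */

struct tlb_entry {
    bool     valid;
    bool     global;  /* set when the mapping's G flag was 1 */
    uint16_t pcid;    /* PCID that was current when the entry was cached */
    uint64_t vpn;     /* virtual page number */
    uint64_t pfn;     /* physical frame number */
};

struct tlb { struct tlb_entry slot[TLB_SLOTS]; };

/* Cache a translation in the first free slot; returns false if the TLB is full. */
static bool tlb_insert(struct tlb *t, uint16_t pcid, uint64_t vpn,
                       uint64_t pfn, bool global)
{
    assert(t != NULL);
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (!t->slot[i].valid) {
            t->slot[i] = (struct tlb_entry){ true, global, pcid, vpn, pfn };
            return true;
        }
    }
    return false;
}

/* A global entry hits under any PCID; a non-global one only under its own. */
static bool tlb_lookup(const struct tlb *t, uint16_t cur_pcid,
                       uint64_t vpn, uint64_t *pfn)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        const struct tlb_entry *e = &t->slot[i];
        if (e->valid && e->vpn == vpn && (e->global || e->pcid == cur_pcid)) {
            *pfn = e->pfn;
            return true;
        }
    }
    return false;
}

/* Models a CR3 load that invalidates one PCID's entries, except global ones. */
static void tlb_flush_pcid(struct tlb *t, uint16_t pcid)
{
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        struct tlb_entry *e = &t->slot[i];
        if (e->valid && !e->global && e->pcid == pcid)
            e->valid = false;
    }
}
```

In this model, a kernel page inserted as global while PCID 1 is current still hits after switching to PCID 2 and survives `tlb_flush_pcid`, whereas a user page inserted as non-global does neither. That is the behavior that would make the feature useful for mappings that are identical in every address space.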

Hence my question: am I understanding global pages correctly, and if so, when would an operating system make use of this feature?

Thanks in advance.

WannabeArchitect
  • The kernel needs its own pages mapped in every set of page tables, with the "U/S" bit cleared so only ring 0 can use them. Making them Global avoids flushing the TLB entries for them on context switch. (The Meltdown vulnerability defeats the U/S bit, so these days it's normal that only a few pages, like the IDT, GDT, and kernel entry points, are mapped while user-space is running.) – Peter Cordes Aug 31 '23 at 02:48
  • You're right that they wouldn't work for regions of shared memory between processes, unless the kernel was prepared to `invlpg` them specifically after `mov` to `cr3` with a page table from a process that didn't have the same shmem region mapped at the same address. – Peter Cordes Aug 31 '23 at 02:51
  • Also related: [How does TLB differentiate between entries of different Page tables?](https://stackoverflow.com/q/70187392) has a footnote about the Global bit, and some discussion in comments about Meltdown (and the kernel using PCIDs to reduce the cost of changing page tables when entering the kernel for interrupts and system calls, not just on context switch). – Peter Cordes Aug 31 '23 at 02:56
  • @PeterCordes Thank you so much! Definitely cleared things up for me :) – WannabeArchitect Aug 31 '23 at 03:25
