I'm reading about huge pages in Linux, where the idea is to use, say, a 2MiB page size instead of a 4KiB page size to reduce TLB misses. I understand that modern CPUs have both data and instruction TLBs, and have separate TLBs for huge pages. What I don't understand is how the CPU knows that a given virtual address actually points to a huge page. It can't just be alignment, because there's no guarantee that any huge pages are allocated at all. Does this mean that the CPU always has to look in both the 4K and the 2M TLBs, just in case? Or what mechanism is used?

A related question is, how does Linux handle page tables when a process uses a mixture of 4K and 2M pages? I mean, it can't just mix them in the same page tables, right?

Peter Cordes
Lajos Nagy
  • 1
    The page-table format in memory is unambiguous. Are you just asking how a TLB can be designed to cache that info? Obviously if the TLB has been kept up to date with changes to the page tables, there can't be TLB hits for both a 2M entry and a 4k entry, since page walking would have stopped at the 2M level on the entry that's a largepage instead of a pointer to a table of 4k PTEs. See [Why in x86-64 the virtual address are 4 bits shorter than physical (48 bits vs. 52 long)?](https://stackoverflow.com/q/46509152), and note the PDE bit that marks one as a 2M page or a pointer to a page table – Peter Cordes Jan 29 '22 at 04:58
  • 2
    But yes, CPUs with different L1dTLBs for different page sizes will probe them in parallel. Related: [TLB usage with multiple page sizes in x86\_64 architecture](https://stackoverflow.com/a/29155014) / [Understanding TLB from CPUID results on Intel](https://stackoverflow.com/q/58128776) and especially **[Address translation with multiple pagesize-specific TLBs](https://stackoverflow.com/q/49842530)** - As Hadi says: *All TLBs are looked up in parallel. There can either be a single hit or all misses*. (Or on multiple hits (from stale TLB entries), the CPU may choose one.) – Peter Cordes Jan 29 '22 at 05:03
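
To make the two comments above concrete, here is a minimal sketch of an x86-64 4-level page walk in which one page directory holds both a pointer to a table of 4K PTEs and a 2M large-page entry, followed by a toy lookup that probes a 4K TLB and a 2M TLB. The bit positions (Present = bit 0, PS = bit 7 of the PDE, 9 index bits per level) follow the x86-64 page-table format; everything else is an assumption made for illustration: the `tables` dict standing in for physical memory, the function names `walk`, `tlb_lookup` and `translate`, and the dict-based TLBs. It ignores 1G pages, permission bits and 5-level paging, and it probes the TLBs one after the other in software, whereas hardware probes them in parallel.

```python
PRESENT = 1 << 0                       # bit 0: entry is valid
PS = 1 << 7                            # bit 7 of a PDE: entry maps a 2 MiB page
ADDR_MASK_4K = 0x000F_FFFF_FFFF_F000   # physical address bits 51:12
ADDR_MASK_2M = 0x000F_FFFF_FFE0_0000   # physical address bits 51:21
SIZE_4K = 4 << 10
SIZE_2M = 2 << 20


def index(vaddr, level):
    """9-bit table index for level 3 (PML4) down to level 0 (PT)."""
    return (vaddr >> (12 + 9 * level)) & 0x1FF


def walk(tables, cr3, vaddr):
    """Walk the toy page tables; return (physical_address, page_size).

    `tables` maps a table's physical base address to {index: entry}.
    This is a hypothetical memory model, not how an OS stores tables.
    """
    base = cr3
    for level in (3, 2, 1, 0):
        entry = tables[base].get(index(vaddr, level), 0)
        if not entry & PRESENT:
            raise LookupError("page fault")
        if level == 1 and entry & PS:
            # The walk stops at the page directory: this PDE is the
            # final translation for a 2 MiB page, not a table pointer.
            return (entry & ADDR_MASK_2M) | (vaddr & (SIZE_2M - 1)), SIZE_2M
        base = entry & ADDR_MASK_4K
    # After the PT level, `base` is the 4 KiB frame address.
    return base | (vaddr & (SIZE_4K - 1)), SIZE_4K


def tlb_lookup(tlb_4k, tlb_2m, vaddr):
    """Probe both toy TLBs, each a dict keyed by virtual page number."""
    hit_4k = tlb_4k.get(vaddr >> 12)
    hit_2m = tlb_2m.get(vaddr >> 21)
    if hit_4k is not None:             # arbitrary preference on a stale double hit
        return hit_4k | (vaddr & (SIZE_4K - 1))
    if hit_2m is not None:
        return hit_2m | (vaddr & (SIZE_2M - 1))
    return None                        # miss in both


def translate(tables, cr3, tlb_4k, tlb_2m, vaddr):
    """TLB lookup first; on a miss, walk and fill the TLB matching the page size."""
    paddr = tlb_lookup(tlb_4k, tlb_2m, vaddr)
    if paddr is None:
        paddr, size = walk(tables, cr3, vaddr)
        if size == SIZE_2M:
            tlb_2m[vaddr >> 21] = paddr & ADDR_MASK_2M
        else:
            tlb_4k[vaddr >> 12] = paddr & ADDR_MASK_4K
    return paddr


# Example address space: one page directory mixing both page sizes.
example_tables = {
    0x1000: {0: 0x2000 | PRESENT},             # PML4 entry 0 -> PDPT
    0x2000: {0: 0x3000 | PRESENT},             # PDPT entry 0 -> PD
    0x3000: {0: 0x4000 | PRESENT,              # PDE 0 -> table of 4K PTEs
             1: 0x4000_0000 | PS | PRESENT},   # PDE 1 -> 2 MiB page
    0x4000: {5: 0x7_7000 | PRESENT},           # PTE 5 -> 4 KiB page
}
```

With these tables, `walk(example_tables, 0x1000, 0x5123)` goes through all four levels and yields a 4K translation, while `walk(example_tables, 0x1000, 0x212345)` stops at PDE 1 and yields a 2M translation. The page size is only discovered during the walk, so in this sketch it is the walk result that decides which TLB gets filled, and a given virtual address ends up in at most one of them.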

0 Answers