What goes on under the hood when an x86 processor accesses memory

Question

I'm going through different material to understand what happens when an x86 instruction accesses memory (implicitly or explicitly). So far, I believe this is the step-by-step process:

Effective address is calculated based on the DS,
If page containing the address is not present:
- The TLB is checked for the page containing the address to be accessed,
  - If not present in the TLB: the MMU translates the virtual address to a physical address
  - add the new translation to the TLB
- Load the page
Access the page

Is this correct? What am I missing?

Terminology: "effective address" is just the "offset" part of a `seg:off` logical address like `ds:[edi + eax*4]`. What you meant to say is that segmentation produces a *linear* virtual address using the segment base, assuming paging is enabled. (In 64-bit long mode, paging must be enabled, but DS base is fixed at 0. FS and GS can still have non-zero bases, allowing their use for thread-local storage.) — Peter Cordes, Jul 26 '23 at 16:57
What do you mean "If page containing the address is not present:"? That's something the CPU figures out by consulting the TLB, falling back to walking the page table on TLB miss. If the page tables say the page isn't present, a #PF page fault exception is raised. Then the *OS* is responsible for updating the page tables (after loading the page if necessary) so a retry of the access won't page-fault. But that's software, not something the CPU does internally. — Peter Cordes, Jul 26 '23 at 17:01
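To make the terminology in the first comment concrete, here is a minimal sketch in C of the two arithmetic steps for an operand like `ds:[edi + eax*4]` in 32-bit mode. The function names are hypothetical, and segment limit and permission checks are left out.

```c
#include <stdint.h>

/* Hypothetical helper: the effective address is only the offset part,
   base + index*scale + displacement, wrapping at 32 bits. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}

/* Hypothetical helper: segmentation adds the segment base to the offset
   to form the linear (virtual) address. Limit and access-rights checks
   are omitted in this sketch. */
uint32_t linear_address(uint32_t seg_base, uint32_t offset)
{
    return seg_base + offset;
}
```

With `edi = 0x1000` and `eax = 3`, the effective address is `0x100C`; with a DS base of 0 (as in a flat memory model) the linear address is the same value, and paging then translates that linear address.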

score 2 · Accepted Answer · answered Jul 27 '23 at 01:57

Determine the effective address (i.e., offset).
Calculate the linear address based on the segment.
Simultaneously,
- Check the L1 cache for the line containing the address.
- Check the TLB for the page containing the address.
- If both of the above hit and the memory type and access type are cacheable, perform the access to the cache. (I'm not going to go into the other memory types.)
If not present in the TLB (or insufficient permissions),
- Perform a page walk to translate the virtual address to a physical address.
- If the page walk fails, raise a page fault.
- If the page walk succeeds, add the translation to the TLB.
If the memory type and access type are cacheable, acquire the cache line and perform the access to the cache.
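The translation part of these steps can be sketched as a toy software model in C. This is only an illustration under simplifying assumptions, not how any real CPU is built: the types and functions (`toy_mmu_t`, `toy_translate`, and so on) are hypothetical, the page table has a single level (real x86 page walks go through several levels), the TLB is a small direct-mapped array, and permission checks, memory types, and the cache lookup are left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT  12u   /* 4 KiB pages */
#define NUM_PAGES   256u  /* toy 1 MiB linear address space */
#define TLB_ENTRIES 16u   /* toy direct-mapped TLB */

typedef struct { bool present; uint32_t frame; } pte_t;
typedef struct { bool valid; uint32_t vpn; uint32_t frame; } tlb_entry_t;

typedef struct {
    pte_t       page_table[NUM_PAGES]; /* single level, unlike real x86 */
    tlb_entry_t tlb[TLB_ENTRIES];
    unsigned    walks;                 /* number of page walks performed */
} toy_mmu_t;

typedef enum { XLATE_OK, XLATE_PAGE_FAULT } xlate_status_t;

void toy_mmu_init(toy_mmu_t *m) { memset(m, 0, sizeof *m); }

/* Plays the role of the OS filling in a page-table entry. */
void toy_mmu_map(toy_mmu_t *m, uint32_t vpn, uint32_t frame)
{
    m->page_table[vpn % NUM_PAGES] = (pte_t){ .present = true, .frame = frame };
}

/* Translate a linear address: TLB first, page walk on a miss. */
xlate_status_t toy_translate(toy_mmu_t *m, uint32_t linear, uint32_t *phys)
{
    uint32_t vpn = linear >> PAGE_SHIFT;
    uint32_t off = linear & ((1u << PAGE_SHIFT) - 1u);
    if (vpn >= NUM_PAGES)
        return XLATE_PAGE_FAULT;

    tlb_entry_t *e = &m->tlb[vpn % TLB_ENTRIES];
    if (!(e->valid && e->vpn == vpn)) {       /* TLB miss */
        m->walks++;                           /* page walk */
        const pte_t *pte = &m->page_table[vpn];
        if (!pte->present)
            return XLATE_PAGE_FAULT;          /* #PF: the OS must fix up and retry */
        *e = (tlb_entry_t){ .valid = true, .vpn = vpn, .frame = pte->frame };
    }
    *phys = (e->frame << PAGE_SHIFT) | off;   /* TLB hit, or freshly filled entry */
    return XLATE_OK;
}
```

After `toy_mmu_map(&m, 5, 0x42)`, translating `0x5123` performs one walk and yields `0x42123`; a second access to the same page hits the TLB without another walk, and an access to an unmapped page returns `XLATE_PAGE_FAULT` without installing a TLB entry.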

prl


Nitpick: L1 cache is not virtually addressed in any x86, so the TLB result is needed as part of determining an L1 hit. (Typically VIPT, so tags+data can be fetched from the set based on the index bits of the address in parallel with the TLB lookup, but a hit is determined by comparing tags). See [VIPT Cache: Connection between TLB & Cache?](https://stackoverflow.com/q/46480015). Outer levels of cache are PIPT so they just use the physical address generated as part of the attempt to access L1d, if the load misses in L1d. – Peter Cordes Jul 27 '23 at 02:06
(For a store, virtual and then physical address when available are written to the store buffer by the store-address uop, eventually committing the data from the store buffer to that line of L1d cache. The virtual address provides earlier detection of same-address or non-overlap for memory disambiguation / store forwarding.) – Peter Cordes Jul 27 '23 at 02:08
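The VIPT point in the comments above can be illustrated with a little arithmetic. The sketch below assumes a 32 KiB, 8-way L1d with 64-byte lines and 4 KiB pages; that geometry is an assumption chosen for illustration, and the function names are hypothetical. With those numbers there are 64 sets, so the set index comes entirely from the page-offset bits, which are identical in the virtual and physical address. That is why the set can be read while the TLB lookup is still in progress, while the tag compare has to wait for the physical address.

```c
#include <stdint.h>

#define LINE_BYTES  64u
#define WAYS        8u
#define CACHE_BYTES (32u * 1024u)
#define NUM_SETS    (CACHE_BYTES / (LINE_BYTES * WAYS))   /* 64 sets */
#define PAGE_BYTES  4096u

/* With this assumed geometry, index + line-offset bits span exactly one
   page, so translation never changes the set index. */
_Static_assert(LINE_BYTES * NUM_SETS == PAGE_BYTES,
               "index bits must fit inside the page offset");

/* Set index: address bits [11:6]. Works on a virtual or physical address. */
uint32_t l1_set_index(uint64_t addr)
{
    return (uint32_t)((addr / LINE_BYTES) % NUM_SETS);
}

/* Tag: the physical address bits above the index, so it needs the TLB result. */
uint64_t l1_tag(uint64_t phys)
{
    return phys / (LINE_BYTES * NUM_SETS);
}
```

For example, a virtual address `0x7fff12345ABC` mapped to physical address `0x1A2B3ABC` selects the same set from either address, because the low 12 bits (`0xABC`) are untouched by translation; only the tag depends on the physical page.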
