I'm going through different material to understand what happens when an x86 instruction accesses memory (implicitly or explicitly). So far, I believe this is the step-by-step process:

  • Effective address is calculated based on the DS,
  • If page containing the address is not present:
    • The TLB is checked for the page containing the address to be accessed,
      • If not present in the TLB: the MMU translates the virtual address to a physical address
      • add the new translation to the TLB
    • Load the page
  • Access the page

Is this correct? What am I missing?

Franks
  • Terminology: "effective address" is just the "offset" part of a `seg:off` logical address like `ds:[edi + eax*4]`. What you meant to say is that segmentation produces a *linear* virtual address using the segment base, assuming paging is enabled. (In 64-bit long mode, paging must be enabled, but DS base is fixed at 0. FS and GS can still have non-zero bases, allowing their use for thread-local storage.) – Peter Cordes Jul 26 '23 at 16:57
  • What do you mean "If page containing the address is not present:"? That's something the CPU figures out by consulting the TLB, falling back to walking the page table on TLB miss. If the page tables say the page isn't present, a #PF page fault exception is raised. Then the *OS* is responsible for updating the page tables (after loading the page if necessary) so a retry of the access won't page-fault. But that's software, not something the CPU does internally. – Peter Cordes Jul 26 '23 at 17:01

1 Answer

  • Determine the effective address (i.e., offset).
  • Calculate the linear address based on the segment.
  • Simultaneously,
    • Check the L1 cache for the line containing the address.
    • Check the TLB for the page containing the address.
    • If both of the above hit and the memory type and access type are cacheable, perform the access to the cache. (I'm not going to go into the other memory types.)
  • If not present in the TLB (or insufficient permissions),
    • Perform a page walk to translate the virtual address to a physical address.
    • If the page walk fails, raise a page fault.
    • If the page walk succeeds, add the translation to the TLB.
  • If the memory type and access type are cacheable, acquire the cache line and perform the access to the cache.
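
As a rough illustration of the address-translation part of the steps above, here is a minimal Python sketch. It is a toy model, not how the hardware is built: the names (`effective_address`, `linear_address`, `translate`, `PageFault`) are made up for this example, the page table is a flat dict instead of a multi-level structure, addresses are truncated to 32 bits, and permission checks and the cache access itself are left out.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages only


class PageFault(Exception):
    """Stands in for the #PF exception the CPU raises when the walk fails."""


def effective_address(base, index, scale, disp):
    # The "offset" part of seg:off, e.g. [edi + eax*4 + disp].
    return (base + index * scale + disp) & 0xFFFFFFFF


def linear_address(segment_base, offset):
    # Segmentation: add the segment base to the effective address.
    return (segment_base + offset) & 0xFFFFFFFF


def translate(linear, tlb, page_table):
    # Split the linear address into virtual page number and page offset.
    vpn, offset = divmod(linear, PAGE_SIZE)
    if vpn not in tlb:
        # TLB miss: do the page walk (here just a dict lookup).
        if vpn not in page_table:
            raise PageFault(hex(linear))  # the OS would handle this, then retry
        tlb[vpn] = page_table[vpn]  # walk succeeded: fill the TLB
    return tlb[vpn] * PAGE_SIZE + offset


page_table = {0x12: 0x80}  # hypothetical mapping: virtual page 0x12 -> frame 0x80
tlb = {}
off = effective_address(base=0x12000, index=4, scale=4, disp=0x8)
phys = translate(linear_address(0, off), tlb, page_table)
print(hex(phys))  # 0x80018
```

A second call to `translate` for the same page finds the entry in `tlb` and skips the walk, which is the whole point of the TLB.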
prl
  • Nitpick: L1 cache is not virtually addressed in any x86, so the TLB result is needed as part of determining an L1 hit. (Typically VIPT, so tags+data can be fetched from the set based on the index bits of the address in parallel with the TLB lookup, but a hit is determined by comparing tags). See [VIPT Cache: Connection between TLB & Cache?](https://stackoverflow.com/q/46480015). Outer levels of cache are PIPT so they just use the physical address generated as part of the attempt to access L1d, if the load misses in L1d. – Peter Cordes Jul 27 '23 at 02:06
  • (For a store, virtual and then physical address when available are written to the store buffer by the store-address uop, eventually committing the data from the store buffer to that line of L1d cache. The virtual address provides earlier detection of same-address or non-overlap for memory disambiguation / store forwarding.) – Peter Cordes Jul 27 '23 at 02:08
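
To make the VIPT point in the comments above concrete, here is a small Python sketch. The cache geometry is an assumption for illustration (32 KiB, 8-way, 64-byte lines, which gives 64 sets), and `set_index` and `tag` are hypothetical helper names. With that geometry the set index comes entirely from the page-offset bits, which translation does not change, so the set can be selected from the virtual address while the TLB lookup is still in flight. The hit/miss decision still needs the physical tag.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_SETS = 64    # assumed: 32 KiB / (8 ways * 64 B per line)
PAGE_SIZE = 4096


def set_index(addr):
    # Address bits [11:6] select the set.
    return (addr // LINE_SIZE) % NUM_SETS


def tag(phys):
    # The remaining upper bits of the *physical* address form the tag.
    return phys // (LINE_SIZE * NUM_SETS)


virt = 0x12345ABC  # hypothetical virtual address
phys = 0x00800ABC  # hypothetical translation with the same page offset (0xABC)

# Index bits lie inside the 4 KiB page offset, so both addresses pick the same set.
assert virt % PAGE_SIZE == phys % PAGE_SIZE
print(set_index(virt), set_index(phys))  # 42 42
print(hex(tag(phys)))                    # 0x800
```

This only works out so neatly because `LINE_SIZE * NUM_SETS` equals the page size in this assumed geometry; a cache with more index bits than the page offset provides would need extra handling.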