As far as I know, a memory access by the CPU involves the CPU cache and the MMU. The CPU first tries to find its target in the cache; on a cache miss, it turns to the MMU. During an access through the MMU, the accessed/dirty bits of the corresponding page table entry are set by hardware.

However, to the best of my knowledge, most CPU designs won't involve the MMU unless there is a cache miss, and so my question is: will the accessed/dirty bits of the page table entry still be set on a cache hit? Or is it architecture dependent?


3 Answers


I think you can assume these bits are cached in the TLB, and if there is any inconsistency between the values in the TLB and accesses done by the core, a microcode assist will be taken and the bits will be updated. For example, if the A¹ or D bit is zero and an access or store happens, this condition will be detected and the appropriate bit will be set.

You can also assume that the fast path for TLB hits doesn't go to memory to check whether the cached TLB bits are consistent with the PTEs in RAM. Furthermore, on x86, changes to a PTE are not pushed, cache-invalidation style, to TLBs by hardware; that is, the TLB is not coherent.

This implies that if the bits are out of sync in certain ways, they will probably not be updated correctly. E.g., if the A (resp. D) bit is set in the TLB, and an access (resp. store) occurs, nothing will happen, even if the A (resp. D) bit is actually unset in the PTE. The entity making changes to the bits is responsible for flushing TLBs so that the bits are correctly updated in the future.


¹ Having a TLB entry with A == 0 is weird: you'd expect the entry to be there as a result of an access, so the A bit would be set from the start. Perhaps there are some scenarios where this might occur, such as a page brought in by a speculative access or prefetch.
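To make the stale-bit scenario in the answer concrete, here is a toy Python model of a TLB that caches A/D snapshots and only takes an "assist" when the cached bit is zero. All class and method names here are invented for illustration; this is a sketch of the behavior described above, not any real hardware interface.

```python
# Toy model: TLB entries cache a snapshot of the A/D bits at fill time.
# The "assist" that writes the in-memory PTE only fires when the cached
# copy says the bit is still zero.

class PTE:
    def __init__(self):
        self.accessed = False
        self.dirty = False

class TLBEntry:
    def __init__(self, pte):
        self.pte = pte
        self.accessed = pte.accessed   # snapshot at fill time
        self.dirty = pte.dirty

class Core:
    def __init__(self):
        self.tlb = {}                  # vpn -> TLBEntry

    def access(self, vpn, pte, is_store=False):
        entry = self.tlb.get(vpn)
        if entry is None:
            entry = self.tlb[vpn] = TLBEntry(pte)   # miss: walk + fill
        # Microcode assist: update PTE only if the cached bit is zero.
        if not entry.accessed:
            entry.accessed = pte.accessed = True
        if is_store and not entry.dirty:
            entry.dirty = pte.dirty = True

    def flush(self):
        self.tlb.clear()               # what a TLB shootdown achieves

core, pte = Core(), PTE()
core.access(0x1, pte, is_store=True)   # sets A and D in the PTE
pte.accessed = False                   # OS clears A to track usage...
core.access(0x1, pte)                  # ...but the cached copy says A=1,
print(pte.accessed)                    # so the PTE stays stale: False
core.flush()                           # shootdown discards stale entries
core.access(0x1, pte)
print(pte.accessed)                    # True again after the flush
```

This mirrors the answer's point: once the TLB believes A (or D) is set, later hits never touch the PTE, so the entity clearing the bits must flush the TLB.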

BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
  • System software may clear the A bit, then check a while later to see if it has been set, to determine which pages are inactive and thus at the front of the queue for eviction. – mevets Jan 27 '20 at 20:59
  • @mevets - yes, that's exactly the primary scenario I am talking about in the last two paragraphs: where the page bits change under "the covers" and the TLB doesn't know about it. In this case, the system software must do a shootdown if it wants the A bit to be set reliably. – BeeOnRope Jan 27 '20 at 21:46

Most caches are virtually indexed and physically tagged (VIPT), for faster access. The CPU issues the virtual address, and the index bits of the address are used to locate the set. In parallel, the address is sent to the TLB to obtain the physical address. By the time the cache has located the entry, the TLB returns the physical address, which is then used for the tag comparison. Now two things can happen:

  1. TLB could not have the entry (TLB miss)
  2. Cache TAG mismatch (Cache miss)

In the case of 1, you need to access the page table entry (PTE) to get the correct physical address.

In the case of 2, if the TLB has returned a valid mapping, you just need to fetch the line from memory. If the TLB also misses (i.e., both 1 and 2), then you need to get the physical address from the PTE and then fetch the data.

So to answer your question: in the case of a hit, the PTE doesn't need to know about it at all.
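The flow above can be sketched as a toy model, assuming hypothetical parameters (64-byte lines, 64 sets, 4 KiB pages, direct-mapped for simplicity); the point to notice is that the page table is only consulted on a TLB miss, never on a hit:

```python
# Toy VIPT lookup: index from virtual address bits, tag from the
# physical address returned by the TLB. Parameters are invented.

LINE = 64      # bytes per cache line
SETS = 64      # number of sets
PAGE = 4096    # page size

def lookup(vaddr, tlb, cache, page_table):
    index = (vaddr // LINE) % SETS            # virtual index bits
    vpn = vaddr // PAGE
    # TLB lookup happens in parallel with reading tags from the set.
    if vpn not in tlb:                        # case 1: TLB miss
        tlb[vpn] = page_table[vpn]            # page walk fills the TLB
    paddr = tlb[vpn] * PAGE + vaddr % PAGE
    tag = paddr // (LINE * SETS)              # physical tag
    if cache.get(index) != tag:               # case 2: tag mismatch
        cache[index] = tag                    # fetch the line from memory
        return "miss"
    return "hit"                              # PTE never consulted here

page_table = {0: 7}                           # vpn 0 -> pfn 7
tlb, cache = {}, {}
print(lookup(0x40, tlb, cache, page_table))   # cold access: miss
print(lookup(0x40, tlb, cache, page_table))   # warm: TLB hit + tag match
```

On the second call the `"hit"` path runs entirely off the TLB entry and the cached tag, which is why the in-memory PTE sees nothing.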

Isuru H
  • Thanks, so the answer is that the bits of the PTE won't be set if a cache hit and a TLB hit both happen? – 黄海鑫 Apr 10 '17 at 15:43
  • To the best of my knowledge, yes. – Isuru H Apr 10 '17 at 15:57
  • 1
    You can't get 1 and 2 at the same time. You can have 1 *and then* 2 on the same access, when it retries after the TLB entry is ready. Then it's just case 2. Without a translation, you don't have anything to check tags against so you don't know if it's a hit or miss, or where to fetch data from. It might even be an unmapped address where there's no translation (-> page fault). – Peter Cordes Jan 25 '20 at 07:39

You usually can't have a cache hit if the page was never accessed in the first place, so that question is irrelevant. (Edit: come to think of it, it may be possible in some bizarre cases of page aliasing, but the same answer for the dirty bit applies there)

It is possible to have a cached line from a clean page (never written to previously). It's a little uncommon since you usually need to initialize data before accessing it, but the page could have been swapped out previously and then reinstalled into the page map (the exact behavior would be OS dependent but it is possible).

In that case, the line is cached (let's say exclusively), and you write to it. The CPU would access the cache and the TLB in parallel, attempting to look up the line in the cache while also doing a TLB access to verify the full physical address, assuming your system is virtually indexed, physically tagged, as most CPUs are these days. The TLB lookup may complete either through a TLB hit, or through a miss followed by a page walk that installs a TLB entry from the actual page map in memory.

The cache access cannot complete until the TLB access (and page walk, if necessary) is done, at which point you will know the value of the accessed/dirty bits. If you are trying to write to a page without the dirty bit set (or access a page without the accessed bit set), you will receive a page fault, triggering the OS to go and update the page in the page table. The OS may choose to do various optimizations at this point, but it will eventually result in correcting these bits.
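The ordering described above can be sketched as a tiny event model (all names here are invented for illustration): even when the line is already cached, the store's commit waits on the translation, which is where a clear D bit gets fixed up first. Note that per the comments below, on x86 the fix-up is done by hardware rather than a page fault, but the sequencing is the same sketch either way.

```python
# Toy sequencing model: a store to an already-cached line still waits
# for the TLB result, because permissions and the D bit live in the
# translation, not in the cache line.

events = []

def translate(tlb_entry):
    events.append("tlb")
    if not tlb_entry["dirty"]:
        # Hardware assist or OS fault handler sets D before the
        # write is allowed to commit.
        events.append("set-dirty")
        tlb_entry["dirty"] = True

def store_hit(tlb_entry):
    translate(tlb_entry)        # translation completes first...
    events.append("write-L1d")  # ...only then does the write commit

entry = {"dirty": False}        # clean page, line already cached
store_hit(entry)
print(events)                   # ['tlb', 'set-dirty', 'write-L1d']
```

A second store to the same page skips the `set-dirty` step, since the translation now carries D = 1.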

Leeor
  • 2
    Thank you, bu I think OS will occasionally clear the accessed bits in page table (for example during page reclaim), so there may be chances that pages cached have their accessed bits unset. I am not sure I fully understand your answer, but did you mean accessing a page without dirty/accessed bit set will cause a page fault? – 黄海鑫 Apr 10 '17 at 15:40
  • Yes. If the code accesses such a page you would get a fault. On the other hand, if the OS changes the page map somehow (maybe in an external process or in the kernel), you'll need to force an update on the copies cached in the TLB, so you'd normally get a TLB shootdown – Leeor Apr 11 '17 at 20:10
  • Virtually tagged? No, modern x86 L1d caches are VIPT (usually with sufficient associativity to make them also PIPT: no aliasing, but they still get to do the TLB lookup in parallel with fetching tags from the indexed set). Most other ISAs use the same trick, sometimes requiring the OS to do page coloring to avoid aliasing. Tags are definitely physical, so the cache doesn't have to be cleared when changing the top-level page table on a context switch. And different logical cores on the same physical core can just competitively share the cache, even if they're using different page tables. – Peter Cordes Jan 25 '20 at 07:31
  • @PeterCordes: fixed, thanks. That was of course a mistake. The whole point in parallel TLB/cache access is to make use of what you already have ahead of translation. – Leeor Jan 27 '20 at 20:48
  • 1
    *If you are trying to write to a page without the dirty bit set (or access a page without the access bit) - you will receive a page fault* - Maybe on some ISAs? On x86 those bits are *written* by hardware. See BeeOnRope's answer to this question. – Peter Cordes Jan 28 '20 at 05:22
  • Also, for a store, the TLB access has to happen during execution of the store-address uop (checking the address and writing the result in the store buffer). But the actual access to L1d doesn't happen until after it retires and it's time to commit to L1d cache. The TLB in parallel with cache tag access I think only happens for loads, assuming a pipelined CPU with a store buffer. On CPUs where VIPT = PIPT because they're associative enough, you can index the cache using just the physical address for stores. I'm not 100% sure if the store buffer might still need the virtual address. – Peter Cordes Jan 28 '20 at 05:23