While studying virtual memory, I learned that a virtual address (generated by the processor to access a memory location) consists of a page number and a page offset. We use a page table to get the physical address (essentially the frame number) corresponding to this page number.

Now, if these addresses (physical/virtual) operate in terms of pages/frames, how does the processor access a cache which operates in terms of blocks/lines?

Also, if the virtual address consists only of a page number and a page offset, where do the tag bits come from that are used to check whether the cache set (selected by the index/set bits) contains the required data?

jhagk
  • Different uses of the same address can break it up different ways. If you insist on thinking of it like a C data structure with fixed fields, think of it as a `union {}` of multiple different structs. – Peter Cordes Oct 02 '20 at 05:57
  • @PeterCordes Can you explain how the same bits can be used under two different addressing schemes? Where exactly is the overlap between these two types of addresses? Type 1: PageNumber + PageOffset. Type 2: Tag + Set/Index + Offset. – jhagk Oct 02 '20 at 06:01
  • [Virtually indexed physically tagged cache Synonym](https://stackoverflow.com/q/46588219) shows an example of a system breaking up addresses in 2 different ways, for paging (page number / page offset) and for cache (tag / index / offset) – Peter Cordes Oct 02 '20 at 06:47
  • Or in general for caches, [Cache Addressing: Length of Index, Block offset, Byte offset & Tag?](https://stackoverflow.com/q/14259088) . Also [How to compute cache bit widths for tags, indices and offsets in a set-associative cache and TLB](https://stackoverflow.com/q/47747772) has a diagram of breaking up and translating a virtual address. – Peter Cordes Oct 02 '20 at 06:51
  • @PeterCordes Is it safe to assume that data transfers between cache and RAM take place in terms of blocks, and transfers between RAM and disk take place in terms of pages? One more question: when the CPU issues a memory access request, what unit of data does it try to operate on? Byte/word, block, or page? I assume it's just trying to access a byte/word. – jhagk Oct 02 '20 at 07:10
  • Yes, to both. SDRAM including modern DDR4 sends 1 read or write request for a burst transfer of a whole 64-byte block, unless it takes special steps to do a smaller transfer. (This being the same size as cache lines in almost all CPUs is not a coincidence). See [What Every Programmer Should Know About Memory?](https://stackoverflow.com/q/8126311) – Peter Cordes Oct 02 '20 at 07:13
  • @PeterCordes Thanks a ton. Once I sat down with pen and paper and worked through an example myself, this concept of the same address being used to address both the TLB and the cache became crystal clear to me. Essentially, the higher-order TAG bits give the page number, and the lower-order TAG bits along with the SET and OFFSET bits yield the page offset. – jhagk Oct 03 '20 at 04:19

1 Answer

I figured out the answer to this question.

  1. The same address can be interpreted under two different addressing schemes. (Thanks @PeterCordes for pointing this out)

    • Scheme 1 (To access the TLB): PageNumber + PageOffset
    • Scheme 2 (To access the cache): Tag + Set/Index + Offset
  2. Usually in VIPT caches, the page number comes from higher-order TAG bits, and the page offset comes from lower-order TAG bits along with SET and OFFSET bits. To prevent aliasing (multiple virtual addresses mapping to the same physical address), it is important that the SET/INDEX bits come entirely from the page offset. This restriction limits the size of the cache to at most the page size multiplied by the associativity.
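The two schemes above can be sketched as plain bit slicing of one address. This is only an illustration under assumed parameters that do not come from the discussion: 4 KiB pages, 64-byte lines, and 64 sets (for example a 32 KiB, 8-way cache). The function names are hypothetical. The slicing shows where the fields sit in an address; in a physically tagged cache the tag that is actually stored and compared is taken from the translated (physical) address.

```python
# Assumed example parameters (not from the thread): 4 KiB pages,
# 64-byte cache lines, 64 sets (e.g. a 32 KiB, 8-way cache).
PAGE_SIZE = 4096
LINE_SIZE = 64
NUM_SETS = 64


def log2_exact(n):
    """Return log2(n), requiring n to be a positive power of two."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a positive power of two")
    return n.bit_length() - 1


def split_for_paging(addr, page_size=PAGE_SIZE):
    """Scheme 1: (page number, page offset)."""
    page_bits = log2_exact(page_size)
    return addr >> page_bits, addr & (page_size - 1)


def split_for_cache(addr, line_size=LINE_SIZE, num_sets=NUM_SETS):
    """Scheme 2: (tag, set index, offset within the line)."""
    offset_bits = log2_exact(line_size)
    index_bits = log2_exact(num_sets)
    offset = addr & (line_size - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


def index_fits_in_page_offset(line_size=LINE_SIZE, num_sets=NUM_SETS,
                              page_size=PAGE_SIZE):
    """True if the index and offset bits all lie inside the page offset."""
    return line_size * num_sets <= page_size


addr = 0x12345678
print(split_for_paging(addr))        # (page number, page offset)
print(split_for_cache(addr))         # (tag, index, offset)
print(index_fits_in_page_offset())   # True for the assumed parameters
```

With these assumed sizes the index and offset together use exactly the 12 page-offset bits, so the bits above them are the same bits the paging scheme calls the page number. Doubling the number of sets to 128 would push one index bit into the page number, which is the case where aliasing has to be handled some other way.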

jhagk
  • In most modern systems, the L1d cache index and offset bits only come from the page offset, allowing VIPT speed with the immunity to aliasing of a PIPT. (You can have caches that use virtual tags, but that's rare these days because they'd generally have to be flushed on context switch.) [How does the VIPT to PIPT conversion work on L1->L2 eviction](https://stackoverflow.com/a/55389830) – Peter Cordes Oct 03 '20 at 04:50
  • But it's also possible to have a VIPT cache where the index includes some bits from the page number, as in [Virtually indexed physically tagged cache Synonym](https://stackoverflow.com/q/46588219) where an OS might use page coloring to avoid synonyms. In such a system, it would also be normal for the cache tags to include a bit that was also part of the index, i.e. the whole page number, so write-back of dirty lines to L2 can generate the full physical address without needing the TLB to translate the index bit(s) that came from the virtual address. (See link in previous comment) – Peter Cordes Oct 03 '20 at 09:26
  • So it's not accurate to say that the page offset has to include any tag bits, or to say that the page number can't include any index bits. Especially if we're talking about PIPT caches (like normal for L2 and other outer caches) where the whole concept of "pages" isn't very important anymore. Also, even L1d tags are often the physical address, (VIPT caches), so it's not accurate to say that the tag comes from the same address bits as the page number. Page number usually means virtual, but if tags are physical then they're (part of) the page *frame* number. – Peter Cordes Oct 03 '20 at 09:30
  • *Usually in VIPT caches, the page number comes from higher-order TAG bits* - Nope, that's the opposite of what I said. Physically tagged means the tags are physical addresses. The actual tag bits *come from* the result of translating the page number to a page-frame number. (And if the cache is small, also some page-offset bits which didn't need to be translated.) Only in a virtually-tagged cache does any of the tag come from the virtual page number. – Peter Cordes Oct 03 '20 at 10:54
  • _Physically tagged means the tags are physical addresses._ - Yes, but note that the virtual address can only contain page information, not frame information. Physically tagged means the tag comes from the physical address, and this physical address (frame number etc.) is available only after the TLB lookup. When I say _'the page number comes from higher-order TAG bits'_, I mean the TAG bits in the virtual address. Essentially, the higher-order TAG bits of the virtual address contain the PAGE number, and the TAG bits in the physical address contain the FRAME number. – jhagk Oct 03 '20 at 11:05
  • You're using "tag" as a synonym for "high". But that's not correct. For example the high bits of a virtual address *aren't* the tag bits. You're trying to simplify more than is possible, and are inventing confusing terminology (like TAG = high) in the process. What you're saying is often true for L1 caches (especially in designs that use VIPT L1d caches with all the page-offset bits being used for cache index and offset, so tag exactly equals page-frame number), but you're potentially making it harder for yourself to imagine other possible designs which exist in real life. – Peter Cordes Oct 03 '20 at 11:18
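
The distinction drawn in these comments (index taken from the untranslated address, tag taken from the translated one) can be sketched as a toy model. Everything here is an assumption for illustration: the bit widths (4 KiB pages, 64-byte lines, 64 sets), the hypothetical `vipt_lookup_key` function, and the dictionary standing in for the TLB/page table. It is not a description of any particular CPU.

```python
# Assumed widths (not from the thread): 4 KiB pages, 64-byte lines, 64 sets.
PAGE_BITS = 12
OFFSET_BITS = 6
INDEX_BITS = 6


def vipt_lookup_key(vaddr, page_table):
    """Return (set index, physical tag) for a toy VIPT cache lookup.

    page_table is a hypothetical dict mapping virtual page number to
    page-frame number; it stands in for the TLB.
    """
    # Index: taken from the virtual address, available before translation.
    index = (vaddr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)

    # Translation: virtual page number -> page-frame number.
    vpn = vaddr >> PAGE_BITS
    pfn = page_table[vpn]
    paddr = (pfn << PAGE_BITS) | (vaddr & ((1 << PAGE_BITS) - 1))

    # Tag: taken from the physical address, so it is only known after
    # translation. With these widths it equals the page-frame number.
    tag = paddr >> (OFFSET_BITS + INDEX_BITS)
    return index, tag


# Two virtual pages mapped to the same frame (synonyms).
page_table = {0x12345: 0xABC, 0x54321: 0xABC}
key_a = vipt_lookup_key(0x12345678, page_table)
key_b = vipt_lookup_key(0x54321678, page_table)
print(key_a, key_b, key_a == key_b)
```

Because all index bits lie inside the page offset in this assumed configuration, both synonyms select the same set and compare against the same physical tag, so they hit the same line. If `INDEX_BITS` were larger, part of the index would come from the virtual page number and the two lookups could land in different sets.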