
Here is the (shared) L3 cache configuration on my Intel Xeon Silver 4210R CPU:

$ getconf -a | grep LEVEL3_CACHE
LEVEL3_CACHE_SIZE                  14417920
LEVEL3_CACHE_ASSOC                 11
LEVEL3_CACHE_LINESIZE              64

This configuration implies that the number of sets in the cache is:

$$\text{sets} = \frac{14417920}{11 \times 64} = 20480$$

Now I am trying to understand the addressing of the cache.

Here, the cache line (or block) size is 64 bytes, and Intel uses byte-addressable memory. Therefore, the least significant $\log_2 64 = 6$ bits of the address should be used for the block offset.

With a similar calculation, the number of address bits that should be used for set indexing is $\log_2 20480 \approx 14.32$, but this fractional value confuses me.
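For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The function name `cache_geometry` is made up for illustration; the three inputs are the `getconf` values from the question. It treats the L3 as one monolithic cache, which is exactly the assumption that produces the fractional result.

```python
import math

# Values reported by `getconf -a | grep LEVEL3_CACHE` in the question.
CACHE_SIZE = 14417920  # total L3 size in bytes
ASSOC = 11             # ways per set
LINE_SIZE = 64         # bytes per cache line


def cache_geometry(size, assoc, line_size):
    """Return (number of sets, block-offset bits, log2 of the set count).

    Hypothetical helper: models the cache as a single set-associative
    array with no slicing and no hashing.
    """
    num_sets = size // (assoc * line_size)
    offset_bits = int(math.log2(line_size))
    index_bits = math.log2(num_sets)  # fractional if num_sets is not a power of 2
    return num_sets, offset_bits, index_bits


num_sets, offset_bits, index_bits = cache_geometry(CACHE_SIZE, ASSOC, LINE_SIZE)
print(num_sets, offset_bits, round(index_bits, 2))  # 20480 6 14.32
```

The non-integer index width is the signal that a single flat array of sets is not the right model for this cache.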

Am I missing something? How many bits are exactly used here for set indexing?

Edit: Erik mentioned in his answer below that each of the 10 processor cores has a 1.375 MiB share of the L3 cache. But such a configuration raises another question in my mind. Let's assume that I am running two processes, one on core-0 and one on core-1. If both processes use virtual address 0x0, will those virtual addresses be mapped to the same core's L3 cache (assuming a VIPT cache)? In other words, as the L3 cache is shared, which part of the virtual address distinguishes the core-0 L3 cache from the core-1 L3 cache?

Peter Cordes
user3862410
  • *How many bits are exactly used here for set indexing?* - That's a separate question; L3 caches use a hash function of higher bits to reduce aliasing conflicts from many addresses using the same offset relative to a page for example. See [According to Intel my cache should be 24-way associative though its 12-way, how is that?](https://stackoverflow.com/q/37162132) / [Determine Cpu cache associativity](https://stackoverflow.com/q/53764676) and [Which cache mapping technique is used in intel core i7 processor?](https://stackoverflow.com/q/49092541) – Peter Cordes Dec 16 '21 at 04:22
  • Maybe also relevant: [How do I see how many slices are in the last level cache?](https://stackoverflow.com/q/65195656). Re: outer caches being PIPT, see [How does the VIPT to PIPT conversion work on L1->L2 eviction](https://stackoverflow.com/q/55387857) – Peter Cordes Dec 16 '21 at 04:22

1 Answer

Am I missing something?

This processor has 10 cores, and your formula doesn't account for the number of cores. If you divide the 20480 sets by 10, you get 2048 sets per core, which is a power of 2.

How many bits are exactly used here for set indexing?

11 bits, I believe


L3$: 13.75 MiB (10 × 1.375 MiB), 11-way set associative, write-back

read more here: https://en.wikichip.org/wiki/intel/xeon_silver/4210r
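The per-slice arithmetic above can be checked with a short Python sketch. It assumes the L3 is divided into 10 equal slices, one per core, as the wikichip line suggests; the helper names are made up for illustration. It only verifies the division and says nothing about how the hardware actually picks a slice or a set.

```python
# Total sets if the L3 is treated as one array: size / (ways * line size).
TOTAL_SETS = 14417920 // (11 * 64)
NUM_SLICES = 10  # assumption: one equal-sized L3 slice per core


def sets_per_slice(total_sets, num_slices):
    """Split the sets evenly across slices (hypothetical helper)."""
    if total_sets % num_slices != 0:
        raise ValueError("sets do not divide evenly across slices")
    return total_sets // num_slices


def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


per_slice = sets_per_slice(TOTAL_SETS, NUM_SLICES)
# For a power of two, bit_length() - 1 equals log2(n).
index_bits = per_slice.bit_length() - 1
print(per_slice, is_power_of_two(per_slice), index_bits)  # 2048 True 11
```

So under the one-slice-per-core assumption, 11 bits are enough to select a set within a slice.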

Erik Eidt
  • That raises another question in my mind. Say, I am running two processes in core-0 and core-1. If both processes use virtual address 0x0, will those virtual addresses be mapped to the same core's L3 cache (assuming VIPT cache)? In other words, as the L3 cache is shared, which part of the virtual address distinguishes the core-0 L3 cache from the core-1 L3 cache? – user3862410 Dec 16 '21 at 00:54
  • The way I read that text, it seems to indicate that the L3 caches are not shared among the 10 cores. Of course, I expect they communicate with / snoop each other. I don't know the internal architecture of the caches (VIPT or other); perhaps someone more knowledgeable about these chips can answer that. – Erik Eidt Dec 16 '21 at 01:07
  • I am pretty sure that the L3 caches are shared. I can literally observe (by monitoring performance counters) that one process running on a specific core can evict the L3 cache content of other processes running on a different core. Anyway, I am adding this part of the question to my main question. – user3862410 Dec 16 '21 at 03:26
  • The L3 slices form one large shared cache on Intel CPUs, although it's no longer inclusive (since Skylake-X). It's PIPT, like all caches other than L1 on mainstream CPUs. The indexing is a hash function of more than 11 bits, for various reasons; see my comment on the question. [How does the VIPT to PIPT conversion work on L1->L2 eviction](https://stackoverflow.com/q/55387857) / [Which cache mapping technique is used in intel core i7 processor?](https://stackoverflow.com/q/49092541) – Peter Cordes Dec 16 '21 at 04:24