Modern last-level caches (LLCs) are divided into slices. I have read several introductions to the topic, but I still cannot figure out how a cache is divided among slices according to the address.
According to the introduction to slices in one paper, the address bits other than the line offset are hashed to obtain the slice ID. The LLC is usually indexed by physical address. The cache parameters of my server are listed below. It has 24 physical cores in total, so it has 24 slices, and each slice sits close to one core.
LEVEL1_ICACHE_SIZE 32768
LEVEL1_ICACHE_ASSOC 8
LEVEL1_ICACHE_LINESIZE 64
LEVEL1_DCACHE_SIZE 32768
LEVEL1_DCACHE_ASSOC 8
LEVEL1_DCACHE_LINESIZE 64
LEVEL2_CACHE_SIZE 262144
LEVEL2_CACHE_ASSOC 8
LEVEL2_CACHE_LINESIZE 64
LEVEL3_CACHE_SIZE 31457280
LEVEL3_CACHE_ASSOC 20
LEVEL3_CACHE_LINESIZE 64
LEVEL4_CACHE_SIZE 0
LEVEL4_CACHE_ASSOC 0
LEVEL4_CACHE_LINESIZE 0
It has two sockets, each with 12 physical cores.
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
According to the parameters above, my LLC size is 31457280 bytes, each cache line is 64 bytes, and the cache is 20-way set-associative. So there are 31457280/64/20 = 24576 cache sets. Each socket has 12 physical cores that share one LLC, so each slice has 24576/12 = 2048 cache sets.
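The arithmetic above can be checked with a short Python sketch. The constants are copied from the getconf output; the function name `cache_geometry` is only illustrative, and one slice per physical core is the assumption stated in the question, not something the output confirms.

```python
LLC_SIZE_BYTES = 31457280   # LEVEL3_CACHE_SIZE (shared by one socket)
LINE_SIZE_BYTES = 64        # LEVEL3_CACHE_LINESIZE
WAYS = 20                   # LEVEL3_CACHE_ASSOC
SLICES_PER_SOCKET = 12      # assumption: one slice per physical core


def bits_needed(count):
    """Smallest number of bits that can enumerate `count` distinct values."""
    return (count - 1).bit_length()


def cache_geometry(size, line, ways, slices):
    """Derive set counts and index widths from the raw cache parameters."""
    total_sets = size // (line * ways)
    sets_per_slice = total_sets // slices
    return {
        "total_sets": total_sets,
        "sets_per_slice": sets_per_slice,
        "offset_bits": bits_needed(line),
        "per_slice_index_bits": bits_needed(sets_per_slice),
        "global_index_bits": bits_needed(total_sets),
        # A power of two has exactly one bit set.
        "total_sets_is_power_of_two": total_sets & (total_sets - 1) == 0,
    }


geometry = cache_geometry(LLC_SIZE_BYTES, LINE_SIZE_BYTES, WAYS, SLICES_PER_SOCKET)
for name, value in geometry.items():
    print(f"{name}: {value}")
```

Per slice, 2048 sets fit exactly in 11 index bits, whereas the socket-wide total of 24576 is not a power of two.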
Which of the following understandings is correct? I lean toward the first.
The set index is numbered independently within each slice. In that case, bits 6-16 of the physical address (11 bits for 2048 sets) index the cache set, and all bits except the line offset are hashed to obtain the slice ID.
The set index is numbered uniformly across all slices. But 24576 is not a power of two: it would need 15 index bits (2^14 = 16384 < 24576 < 2^15 = 32768), so it does not map cleanly onto address bits.
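To make the first understanding concrete, here is a minimal Python sketch of how a physical address would be decomposed under it. The slice hash used on real processors is not given in this post, so `placeholder_slice_hash` is a purely hypothetical stand-in (an XOR fold followed by a modulo); only the bit positions for the offset and set index come from the reasoning above.

```python
LINE_OFFSET_BITS = 6    # 64-byte lines -> bits 0-5
SET_INDEX_BITS = 11     # 2048 sets per slice -> bits 6-16
NUM_SLICES = 12         # assumption: one slice per core on a socket


def placeholder_slice_hash(line_addr, num_slices):
    """Hypothetical hash over all bits above the line offset.

    This is NOT the hash a real CPU uses; it only shows that the slice ID
    depends on every non-offset bit, including the set-index bits.
    """
    folded = 0
    while line_addr:
        folded ^= line_addr & 0xFFFF
        line_addr >>= 16
    return folded % num_slices


def decompose(paddr):
    """Split a physical address into (slice_id, set_index, line_offset)."""
    line_offset = paddr & ((1 << LINE_OFFSET_BITS) - 1)
    line_addr = paddr >> LINE_OFFSET_BITS
    set_index = line_addr & ((1 << SET_INDEX_BITS) - 1)
    slice_id = placeholder_slice_hash(line_addr, NUM_SLICES)
    return slice_id, set_index, line_offset


if __name__ == "__main__":
    for addr in (0x12345, 0x12345 + 64, 0x7FFF_FFC0):
        slice_id, set_index, offset = decompose(addr)
        print(f"{addr:#x}: slice={slice_id} set={set_index} offset={offset}")
```

Under this model, two addresses within the same 64-byte line always land in the same slice and set, while the slice ID and the set index are derived from overlapping address bits.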