i7 Nehalem/Westmere L1 instruction cache

Question

I was reading the Hennessy and Patterson book, "Computer Architecture: A Quantitative Approach", and I found this:

and this:

"Notice that for the four-way associative instruction cache, 13 bits are needed for the cache address: 7 bits to index the cache plus 6 bits of block offset for the 64-byte block, but the page size is 4 KB = 2^12, which means that 1 bit of the cache index must come from the virtual address. This use of 1 bit of virtual address means that the corresponding block could actually be in two different places in the cache, since the corresponding physical address could have either a 0 or 1 in this location. For instructions this does not pose a problem, since even if an instruction appeared in the cache in two different locations, the two versions must be the same."
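The bit counts in the quoted passage can be checked with a few lines of arithmetic. This is only a sketch: `cache_geometry` is a hypothetical helper, and the 32 KiB capacity is an assumption inferred from the quote (128 sets × 4 ways × 64-byte blocks).

```python
def cache_geometry(size_bytes, ways, line_bytes, page_bytes):
    """Return (sets, offset_bits, index_bits, index_bits_taken_from_vpn).

    Hypothetical helper; all sizes are assumed to be powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1       # log2(line size)
    index_bits = sets.bit_length() - 1              # log2(number of sets)
    page_offset_bits = page_bytes.bit_length() - 1  # log2(page size)
    # Index bits that do not fit inside the page offset must come from
    # the virtual page number.
    from_vpn = max(0, offset_bits + index_bits - page_offset_bits)
    return sets, offset_bits, index_bits, from_vpn


# Assumed geometry from the quote: 32 KiB, 4-way, 64-byte blocks, 4 KiB pages.
print(cache_geometry(32 * 1024, 4, 64, 4096))  # (128, 6, 7, 1)

# Same assumed capacity but 8-way: every index bit fits in the page offset.
print(cache_geometry(32 * 1024, 8, 64, 4096))  # (64, 6, 6, 0)
```

With 4 ways, offset plus index needs 13 bits, one more than the 12 untranslated page-offset bits. With 8 ways the count drops to 12 and no index bit depends on translation.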

But I don't understand it. I don't see any problem here, beyond odd pages being mapped to the upper half of the cache sets. This is a VIPT cache so we will find the entire physical TAG in the, won't we? So... where is the problem?

Why do you think this has anything to do with Intel i7 CPUs? [Which cache mapping technique is used in intel core i7 processor?](https://stackoverflow.com/q/49092541) Unlike AMD, Intel has kept their L1i/d caches small enough and associative enough not to have that complication (all the index bits come from the page-offset bits which are the same in virt and phys). This case of having 1 index bit come from the virtual page number is the same as [Virtually indexed physically tagged cache Synonym](https://stackoverflow.com/q/46588219) - it creates a possible "synonym" aliasing problem. — Peter Cordes, Jun 18 '20 at 17:46
*so we will find the entire physical TAG in the, won't we* - not if the index points you to a different set, if two processes have the same phys page mapped to different virtual addresses. — Peter Cordes, Jun 18 '20 at 17:47
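A small sketch of how the index can point to a different set: two virtual addresses that share a page offset but differ in bit 12 select different sets, even if both translate to the same physical address. The addresses below are made up for illustration, and the 6 offset bits and 7 index bits are taken from the passage quoted in the question.

```python
LINE_BITS = 6    # 64-byte blocks
INDEX_BITS = 7   # 128 sets
PAGE_BITS = 12   # 4 KiB pages


def set_index(addr):
    """Set selected by a virtually indexed cache: address bits 6..12."""
    return (addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


# Hypothetical mappings of the same physical page at two virtual pages.
virt_a = 0x10000 + 0x040  # virtual page number is even: bit 12 = 0
virt_b = 0x23000 + 0x040  # virtual page number is odd:  bit 12 = 1

# The page offsets match, so both could name the same physical byte...
assert virt_a & ((1 << PAGE_BITS) - 1) == virt_b & ((1 << PAGE_BITS) - 1)

# ...but the lookups go to different sets.
print(set_index(virt_a), set_index(virt_b))  # 1 65
```

The physical tag would match in either set, but a lookup only compares tags within the set that the index selects, so a line cached via `virt_a` is not found by a lookup via `virt_b`.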
Well, the book (Computer Architecture: A Quantitative Approach) was talking about i7 CPUs and saying that they have this "problem" in their instruction caches. You've mentioned "two processes have the same phys page mapped to different virtual addresses"; when is that possible? I can only think of a process and its child. Am I on the right track? — isma, Jun 18 '20 at 17:53
Most often with `mmap(MAP_SHARED)` on a file for two user-space processes. Or even more simply, the Linux kernel direct-maps all physical RAM with hugepages as well as mapping the same pages to user-space virtual addresses (and/or to other kernel virtual addresses with `vmalloc`). https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt. — Peter Cordes, Jun 18 '20 at 17:59
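For illustration, the same effect can be shown inside a single process by mapping one file page twice with a shared mapping. This is a sketch, not taken from the thread: `map_same_page_twice` is a hypothetical helper, and it uses Python's `mmap` with `ACCESS_WRITE`, which gives a shared, write-through mapping.

```python
import ctypes
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def map_same_page_twice():
    """Map one file page at two virtual addresses; return (bytes_seen, addr_a, addr_b)."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * PAGE)  # give the file one page of backing data
        a = mmap.mmap(fd, PAGE, access=mmap.ACCESS_WRITE)
        b = mmap.mmap(fd, PAGE, access=mmap.ACCESS_WRITE)
        try:
            a[0:5] = b"hello"     # store through the first mapping
            seen = bytes(b[0:5])  # load through the second mapping
            # Read out the virtual address of each mapping via ctypes.
            buf_a = ctypes.c_char.from_buffer(a)
            buf_b = ctypes.c_char.from_buffer(b)
            addr_a = ctypes.addressof(buf_a)
            addr_b = ctypes.addressof(buf_b)
            del buf_a, buf_b      # release the buffers so close() succeeds
        finally:
            a.close()
            b.close()
    finally:
        os.close(fd)
        os.unlink(path)
    return seen, addr_a, addr_b


seen, addr_a, addr_b = map_same_page_twice()
print(seen, hex(addr_a), hex(addr_b))
```

The write through one mapping is visible through the other because both refer to the same underlying page, while the two virtual addresses differ. Whether bit 12 of the two addresses differs depends on where the OS happens to place the mappings.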
It seems first-gen i7 series CPUs like Nehalem/Westmere did have only 4-way associativity for L1i-cache: https://www.7-cpu.com/cpu/Westmere.html, and https://www.realworldtech.com/nehalem/4/ agrees. Interesting. Every later i7 CPU (starting with https://www.7-cpu.com/cpu/SandyBridge.html) has had 8-way L1i / L1d. (This is why "i7" is such a useless way to describe a CPU; it spans Nehalem through current Ice Lake.) — Peter Cordes, Jun 18 '20 at 18:02
But even simpler than that: if I include "fork()" in my code and I don't modify the value of any variable, the virtual pages of both processes (parent and child) will reference the same physical pages, won't they? @PeterCordes — isma, Jun 18 '20 at 18:03
They do, but just like with threads in a single process, the *same* virtual address maps to the same physical page. That can't create problems. — Peter Cordes, Jun 18 '20 at 18:05
