Does x86 provide instructions to load which data goes into cache?

Question

There are prefetch instructions, as mentioned here: https://c9x.me/x86/html/file_module_x86_id_252.html, which allow a system program to HINT to the CPU which data SHOULD go into the cache. But if only a single simple program is loaded by the bootloader as an operating system (something as simple as MikeOS), can these hints make it almost certain which data sits in the cache, since only a single process is directing the CPU? Or is the CPU still doing branch prediction on its own? Does that mean the CPU's internal branch prediction is preferred for this purpose?

Don't think so, no. Besides, the performance penalty of allowing it would probably far outweigh any gains. How would you ensure that your caching policy is the best for all CPU architectures, all cache sizes, etc? — fredrik, Jan 06 '18 at 10:02
x86 has a [`prefetch` instruction](https://c9x.me/x86/html/file_module_x86_id_252.html), but usually the CPU is smart enough to recognize access patterns itself, so you would have to measure the performance carefully to see whether it truly helps or just adds useless opcodes to the binary. — Ped7g, Jan 06 '18 at 10:07
@fredrik Thanks for your comments, but could you expand a little bit on what could be some performance penalties? — Yahya, Jan 06 '18 at 10:17
The most common performance penalty is bad algorithms. It sounds like you're aiming for something that would almost certainly be premature optimization — fredrik, Jan 06 '18 at 10:18
yes. x86 SSE and NIOS II are some examples. Possible duplicate of [Are there any such processors which have instructions to bypass the cache?](https://stackoverflow.com/q/17093509/995714) — phuclv, Jan 06 '18 at 10:20
Since this is such a wide-open question (not a Stack Overflow question), the answer is absolutely. If you are in control of the processor and understand the architecture, then you can control most if not all of what is in the caches. Even when not completely in control you can encourage things to be in cache, but you are competing with other code, and the less you know about what that other code is doing, the less control you have. — old_timer, Jan 06 '18 at 14:17
You should ask a real question and be clearer about what you are after. You need to specify your architecture, system, and operating system, and at that point you can just look up the answer. Caches simply respond to what the bus tells them to do. If you are running an operating system like Windows or Linux, you have an MMU that in part controls which address spaces are cacheable, and that is mostly if not completely driven by the operating system. Even so, with some luck you could do some striping at the application level to erase things in cache that you have read. — old_timer, Jan 06 '18 at 14:22
If this is yet another Meltdown/Spectre question, this is not how you work around them. — old_timer, Jan 06 '18 at 14:24
Possible duplicate of [Are there any such processors which have instructions to bypass the cache?](https://stackoverflow.com/questions/17093509/are-there-any-such-processors-which-have-instructions-to-bypass-the-cache) — phuclv, Jan 07 '18 at 01:34
x86 has `clflush` and `prefetch` instructions to evict or cache specific cache lines, but usually caches work fine on their own (with HW prefetch, or simply from demand loads bringing data into cache), with an LRU eviction strategy. — Peter Cordes, Jan 07 '18 at 06:02
Don't make an edit that significantly changes the meaning of the question. If you have a different question, ask another one. — phuclv, Jan 08 '18 at 02:06
@LưuVĩnhPhúc: The first version of the question was not useful or interesting. Your rollback invalidated my answer. Perhaps this question is just unsalvageable, though. — Peter Cordes, Jan 08 '18 at 02:23

score 2 · Answer 1 · answered Jan 08 '18 at 00:59

The branch-target buffer (and other branch-prediction caches like the branch history buffer) are totally separate from the L1D / L1I / L2 / L3 memory caches.

x86 has no instructions to manage the branch-prediction caches, only the data / instruction caches (which is what most people mean when they just say "cache"). AFAIK, no other architectures have cache-control instructions for the branch prediction caches either, but you seem to be only discussing x86.

I'm also not sure whether you're actually trying to ask about the branch-prediction caches or whether you're just totally confused about how CPUs work. See What Every Programmer Should Know About Memory by Ulrich Drepper, and a 2017 update on a few minor points.

This is pretty much all it's possible to say in response to such a mixed-up question, apart from the links in comments about controlling the contents of the data / instruction caches (which as I said is what most people mean when they talk about loading data into caches).
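For the data caches, here is a minimal sketch of what those hints look like from C. It assumes GCC or Clang on x86-64 (so SSE2 and `clflush` are available), and a 64-byte cache line, which is typical but not guaranteed. The helper names `prefetch_range`, `flush_range`, and `sum_bytes` are made up for illustration. `_mm_prefetch` only asks the CPU to pull lines in and `_mm_clflush` evicts them; neither pins data in cache, and neither changes what a later load returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_clflush, _mm_mfence; also pulls in _mm_prefetch */

#define CACHE_LINE 64u  /* assumption: 64-byte lines; query CPUID in real code */

/* Hint that the lines covering buf[0..len) will be read soon.
   This is only a hint: the CPU may ignore it or evict the lines again. */
static void prefetch_range(const void *buf, size_t len)
{
    assert(buf != NULL);
    uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)buf + len;
    for (uintptr_t a = start; a < end; a += CACHE_LINE)
        _mm_prefetch((const char *)a, _MM_HINT_T0);  /* T0: into all cache levels */
}

/* Evict the lines covering buf[0..len) from every cache level,
   writing dirty lines back to memory first. */
static void flush_range(const void *buf, size_t len)
{
    assert(buf != NULL);
    uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)buf + len;
    for (uintptr_t a = start; a < end; a += CACHE_LINE)
        _mm_clflush((const void *)a);
    _mm_mfence();  /* order the flushes before later loads and stores */
}

/* Ordinary demand loads: these bring data into cache on their own. */
static unsigned long sum_bytes(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++)
        total += buf[i];
    return total;
}
```

Calling `prefetch_range(data, n)` some time before `sum_bytes(data, n)` can hide part of the miss latency, and `flush_range(data, n)` forces the next pass to miss. The sum is identical either way, so only timing measurements show whether the hints helped.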
