
I think modern Intel processors use a store queue (store buffer) that holds both the address and the data of each store, and that stores do not go to L1 cache until the commit stage. But I am not 100% sure whether this is correct, or whether a store is allowed to write L1 cache at the execution stage, i.e., before the commit stage. Regarding loads, I do not know whether there is a load queue, a load-store queue, or some other structure, or whether a load is allowed to read from L1 cache at the execution stage, i.e., before the commit stage.

I also want to know whether Intel processors have a memory dependence predictor that can predict true dependencies between loads and stores before their addresses are known.
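To make the dependence question concrete, here is a minimal sketch of the kind of loop I have in mind. The function name `histogram` and its parameters are made up for illustration. The load in one iteration truly depends on the store of the previous iteration only when two consecutive indices are equal, which the hardware cannot know until both addresses have been computed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: count how often each byte value occurs.
 * Each iteration loads hist[idx[i]], adds 1, and stores it back.
 * The load in iteration i+1 aliases the store from iteration i only
 * if idx[i+1] == idx[i]. An out-of-order core that wants to start the
 * next load early has to guess whether that is the case. */
void histogram(unsigned int *hist, const uint8_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        hist[idx[i]]++;   /* load, increment, store to a data-dependent address */
    }
}
```

With mostly distinct indices, consecutive iterations are independent and could overlap. With long runs of the same index, each load needs the value of the store just before it.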

Please help me to clarify my doubts.

Answer: The linked duplicate has a long answer to a different question, which also answers mine. Loads are allowed to execute speculatively and read L1d cache before retirement. A store writes its address and data into the store buffer when it executes, but the store buffer does not commit that data to L1d cache until the store has retired and is known to be non-speculative. The likely reason is that the outside world may influence the CPU through a speculative load (the CPU can discard the result if the speculation turns out to be wrong), but the CPU must not influence the outside world through a speculative store, so a store becomes visible only once it is certain to happen.
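As a rough illustration of that decoupling, here is a toy software model. It is not how Intel hardware is built: the names `ToyCore`, `exec_store`, `exec_load`, `commit_oldest` and `squash_youngest` and the sizes are all invented for this sketch. Executed stores sit in a small buffer, loads check that buffer before reading memory, and only retired stores are written to the array that stands in for L1d.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MEM_WORDS 16  /* toy stand-in for L1d: 16 words (assumed size) */
#define SB_SIZE   8   /* toy store buffer capacity (assumed size) */

typedef struct { size_t addr; uint32_t data; } SbEntry;

typedef struct {
    uint32_t mem[MEM_WORDS]; /* committed state, visible to the outside world */
    SbEntry  sb[SB_SIZE];    /* executed but uncommitted stores, oldest first */
    size_t   count;          /* number of valid entries in sb */
} ToyCore;

/* Execute a store: record address and data in the buffer only.
 * Returns false if the buffer is full or the address is out of range. */
bool exec_store(ToyCore *c, size_t addr, uint32_t data)
{
    if (c->count == SB_SIZE || addr >= MEM_WORDS) return false;
    c->sb[c->count++] = (SbEntry){addr, data};
    return true;
}

/* Execute a load (caller must pass addr < MEM_WORDS): forward from the
 * youngest buffered store to the same address, else read memory. */
uint32_t exec_load(const ToyCore *c, size_t addr)
{
    for (size_t i = c->count; i-- > 0; )
        if (c->sb[i].addr == addr) return c->sb[i].data;
    return c->mem[addr];
}

/* The oldest n stores have retired: write them to memory in order. */
void commit_oldest(ToyCore *c, size_t n)
{
    if (n > c->count) n = c->count;
    for (size_t i = 0; i < n; i++) c->mem[c->sb[i].addr] = c->sb[i].data;
    for (size_t i = n; i < c->count; i++) c->sb[i - n] = c->sb[i];
    c->count -= n;
}

/* Mis-speculation: drop the youngest n stores; memory is never touched. */
void squash_youngest(ToyCore *c, size_t n)
{
    if (n > c->count) n = c->count;
    c->count -= n;
}
```

In this model a squashed store leaves no trace in `mem`, while a load can still see an older buffered store to the same address through forwarding.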

This short answer is for those who want a quick answer to my question without following another link and reading that long answer. The link does not cover memory dependence prediction in Intel processors, which is different from memory disambiguation.

  • Can you rephrase this as a computer programming question? What is the programming problem that this information would help with? – Raymond Chen May 15 '19 at 03:08
  • This is a question about the architecture and I do not know how to ask it as a programming problem. If you know the answer, please answer. – world_of_science May 15 '19 at 05:07
  • This site is for computer programming questions. If architectural question affects programming, that would make it on topic. But it seems that when the store goes to L1 doesn't affect the programming model – Raymond Chen May 15 '19 at 05:20
  • My answer on [Regarding instruction ordering in executions of cache-miss loads before cache-hit stores on x86](//stackoverflow.com/q/56070123) pretty much covers speculative loads and stores, and how the store buffer decouples speculative execution of stores from commit to L1d after they're known to be non-speculative. – Peter Cordes May 15 '19 at 07:07
  • And yes, there is dynamic prediction for memory dependencies / memory disambiguation. I'm not sure how much is known about the actual implementation details. – Peter Cordes May 15 '19 at 07:07
  • Nice update, that's a good summary of the answer to your question. You might want to ask a separate question *just* about memory dependency prediction, so that can get answered separately from this one. Ideally include some real-world code that would be sped up by accurate dependency prediction, so it's clear what kind of thing you're asking about. (That would also make it a better fit for Stack Overflow, although IMO questions about the microarchitecture of *real* CPUs are fine.) – Peter Cordes May 15 '19 at 08:37
