
I am reading http://www.realworldtech.com/sandy-bridge/ and I have trouble understanding some of it:

The load buffer grew by 33% and can track 64 μops in-flight. The store buffer increased slightly to 36 stores, for an overall 100 simultaneous memory operations, roughly two thirds of the total number of μops in-flight.

What are μops in-flight? What is a load buffer?

Gilgamesz
  • When they say "uop," they probably mean "μop" for micro-op. You can think of a load buffer as a cache to hold results of load instructions (likewise for store buffers). More details can be found in the answer [here](http://stackoverflow.com/questions/11105827/what-is-a-store-buffer). – Jeff Mercado Apr 19 '16 at 21:24
  • In-flight μops are recorded in the re-order buffer (168 entries in Sandy Bridge). When an operation has its operands renamed and is inserted into the scheduler, it enters the ROB; when the operation is committed (in-order), the ROB entry is freed and the operation is no longer "in-flight". The load buffer tracks load addresses and their order to support store-forwarding (as well as to detect mis-speculation when a load turns out to depend on an earlier unresolved store). – Apr 20 '16 at 03:38
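
To make the store-forwarding point concrete, here is a minimal C sketch (the function name and the global variable are invented for illustration, not taken from the thread or the article). Think of it as the machine-level sequence rather than what an optimizing compiler would emit:

```c
/* The store to x puts its address and data into a store-buffer entry.
 * The immediately following load of x can execute before that store has
 * reached the cache: the hardware detects the older store to the same
 * address and forwards the value directly from the store buffer. */
static int x;

int store_then_load(int v) {
    x = v;        /* store uop: sits in a store-buffer entry until commit */
    return x;     /* load uop: value forwarded from that entry            */
}
```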

1 Answer


Sandy Bridge processors are Out-of-Order (OOO) processors. This means the processor will try to execute instructions in the instruction stream as soon as they can be executed, regardless of the order in which the program text lists them (with a lot of caveats: the re-ordering cannot change the observable results, and, for example, an instruction's inputs have to be available before it can actually execute).
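
As a rough illustration (the function name here is made up, not from the article), the two loads below are independent, so an out-of-order core can start the second while the first is still waiting on memory; the add, however, depends on both, so the observable result is the same as in program order:

```c
/* Two independent loads followed by a dependent add. */
long sum_two(const long *a, const long *b) {
    long x = *a;     /* load uop #1: may miss in the cache          */
    long y = *b;     /* load uop #2: no dependency on x, can start
                        before load #1 has finished                 */
    return x + y;    /* add must wait for both loads to complete    */
}
```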

So, as instructions get decoded into micro-ops (uops), they are considered for execution. The processor has a maximum number of uops it can have in the various stages of execution at once; those are the uops that are in-flight.

A load buffer is a temporary storage location for the results of load uops. Since many loads can execute in parallel, they need to know up front where they'll hold the data when it comes back from the memory subsystem. Having 64 entries means you can have 64 load uops executing "concurrently".
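
A hedged C sketch of what those entries buy you (the function names and the node type are invented for illustration): in the first loop every load address is known up front, so many loads can occupy load-buffer entries at the same time; in the second loop each load's address comes from the previous load, so only one load in the chain can be outstanding no matter how many entries the core has.

```c
#include <stddef.h>

/* Independent loads: lots of memory-level parallelism. */
long sum_array(const long *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];          /* addresses are independent of each other */
    return s;
}

struct node { struct node *next; long value; };

/* Pointer chase: each load depends on the previous one. */
long sum_list(const struct node *p) {
    long s = 0;
    while (p != NULL) {
        s += p->value;      /* needs the pointer produced by the last load */
        p = p->next;        /* serial chain of dependent loads             */
    }
    return s;
}
```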

Bahbar
  • More accurately - you can have 64 load uops waiting for execution or commit at any given moment. They don't have to execute together, but the fetch unit can see past them in case some younger operation may be performed independently. – Leeor Apr 21 '16 at 06:00
  • @Leeor: right, that's what I called "various stages of execution". Which I realize is technically improper, but makes the point to somebody not versed in OOO. Using "commit" without explaining its meaning is defeating that goal. – Bahbar Apr 22 '16 at 08:03
  • So, in fact, when (for example) mov rcx, [rax] executes, this instruction will (probably) be divided into load micro-ops, the load gets its data from memory (RAM/cache), and the data will be placed in the load buffer, am I right? – Gilgamesz Apr 22 '16 at 14:44