This question concerns the Haswell microarchitecture (Intel Xeon E5-2640 v3 CPU). From the CPU specifications and other resources, I found that there are 10 line fill buffers (LFBs) and that the super queue (SQ) has 16 entries. I have two questions about the LFBs and the SQ:
1) What is the maximum degree of memory-level parallelism the system can provide: 10 (limited by the LFBs) or 16 (limited by the SQ)?
2) According to some sources, every L1D miss is recorded in the SQ, which then assigns a line fill buffer; other sources say that the SQ and the LFBs can work independently. Could you briefly explain how the SQ works?
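To make question 1 concrete, here is a rough sketch of why the answer matters, using Little's law (sustained bandwidth = lines in flight × line size / latency). The 80 ns memory latency is an assumed placeholder for illustration, not a measured value for this CPU, and the function name is my own.

```python
CACHE_LINE_BYTES = 64  # cache line size on Haswell


def max_bandwidth_gb_per_s(outstanding_lines: int, latency_ns: float) -> float:
    """Little's law bound: in-flight lines * bytes per line / latency.

    Bytes per nanosecond is numerically equal to GB/s (10^9 bytes per second).
    """
    return outstanding_lines * CACHE_LINE_BYTES / latency_ns


# Assumption for illustration only; the real latency must be measured.
ASSUMED_LATENCY_NS = 80.0

for name, entries in (("LFB", 10), ("SQ", 16)):
    bound = max_bandwidth_gb_per_s(entries, ASSUMED_LATENCY_NS)
    print(f"{name}: {entries} lines in flight -> at most {bound:.1f} GB/s")
```

Under that assumed latency, the bound differs noticeably depending on whether 10 or 16 misses can be outstanding at once, which is why I want to know which structure is the actual limit.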
Here is an example figure (not for Haswell) showing the SQ and the LFBs.
References:
https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf