If I understood correctly, a stall/freeze in a pipeline causes a clock cycle to go to waste. If a few cycles (out of billions per second) are wasted, it may not be a big deal, or even measurable in terms of performance. But I am curious which operations cause a pipeline bubble. Would a load barrier halt the pipeline because data needs to be fetched from memory or the level-3 cache, or is a bubble only created when the current instruction depends on the result of a previous one? Something like:
int score = randGen.nextInt(6);
int pay = score * 100; // depends on the result of the previous instruction
If my above assumption that a memory barrier causes a bubble is incorrect, does that mean incorrect branch prediction, false sharing, and load/store barriers leave the CPU even worse off (fetching data from memory and flushing the load and store buffers)?
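To make the cases I'm asking about concrete, here is a minimal Java sketch (class and field names are my own invention, purely for illustration): a data dependency like the one above, a hard-to-predict branch, two plain fields that two threads update and that may or may not land on the same cache line (the JVM gives no layout guarantee), and a volatile write, which I understand JITs typically implement with a store barrier.

import java.util.Random;

public class PipelineCosts {
    // A volatile write is typically compiled with a store barrier,
    // which drains the store buffer before later memory operations become visible.
    static volatile boolean published = false;

    // Two plain fields updated by different threads; if they happen to land on
    // the same cache line, that line ping-pongs between cores (false sharing).
    static long counterA = 0;
    static long counterB = 0;

    public static void main(String[] args) throws InterruptedException {
        Random randGen = new Random();

        // Data dependency (the case above): 'pay' cannot execute until
        // 'score' is available - a short bubble, not a pipeline flush.
        int score = randGen.nextInt(6);
        int pay = score * 100;

        // Hard-to-predict branch: a ~50/50 random condition defeats the
        // branch predictor, and each misprediction flushes in-flight work.
        int sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            if (randGen.nextInt(2) == 0) {
                sum += i;
            }
        }

        // Potential false sharing: two threads hammer adjacent fields.
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) counterA++; });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1_000_000; i++) counterB++; });
        t1.start(); t2.start();
        t1.join(); t2.join();

        published = true; // volatile write => store barrier on typical JITs/x86

        System.out.println(pay + " " + sum + " " + (counterA + counterB));
    }
}

My question is essentially which of these constructs merely inserts a bubble and which ones cost much more.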