When a process is context switched, its state must be saved somewhere so that it can resume from the same point later. The state of a running process includes the data held in the caches and in the store buffer. On x86, the cache-coherency protocol ensures that the part of the program's state held in the caches becomes visible to all the other cores. However, this is not true for store buffers, which by design are not coherent with the rest of the memory system. A write to memory first lands in the store buffer, before it drains out to the coherent caches. As a result, while a program is running, the part of its state that resides in the store buffer may not be immediately visible to all cores.

So when this process gets context switched and assigned to a different core, it is possible that the new core has not seen the writes the process performed on the old core, as they might still be sitting in the old core's store buffer. There is therefore a possibility of losing state during a context switch. That's why I think the store buffer must be flushed during context switches to make the pending writes visible to the whole system.
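As an aside, the reason I believe pending stores can genuinely be invisible to another core is the classic store-buffer litmus test. Below is my own minimal sketch of it in C (the thread names and the use of relaxed C11 atomics are my choices; the relaxed accesses stand in for plain stores and loads; compile with -pthread):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Two flags, both initially 0; relaxed atomics stand in for plain
   stores and loads (no extra ordering is requested). */
atomic_int flag0 = 0, flag1 = 0;
int r0, r1;

void *thread0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&flag0, 1, memory_order_relaxed); /* store... */
    r0 = atomic_load_explicit(&flag1, memory_order_relaxed); /* ...then load */
    return NULL;
}

void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&flag1, 1, memory_order_relaxed);
    r1 = atomic_load_explicit(&flag0, memory_order_relaxed);
    return NULL;
}

int main(void)
{
    pthread_t t0, t1;
    pthread_create(&t0, NULL, thread0, NULL);
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    /* r0 == 0 && r1 == 0 is possible on x86: each store may still be in
       its core's store buffer when the other core's load runs. */
    printf("r0=%d r1=%d\n", r0, r1);
    return 0;
}

Both r0 and r1 ending up 0 is an allowed outcome on x86 precisely because each store can still be sitting in its core's store buffer when the other core's load executes. That said, the scenario I am actually worried about is a single process migrating between cores.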
For example, suppose that the process is running the following pseudocode on Core-0:
bool x = false; // assume x is NOT register allocated; it lives somewhere in memory
// ... some random work ...
x = true;
/* ---------- context switched here; process now assigned to Core-1 ---------- */
if (x)
{
    // do something
}
else
{
    // do something else
}
In this case the write done to the variable x (x = true) will be put in the store buffer of Core-0, as the program was running on Core-0 initially. After the context switch, the program is assigned to Core-1, which may not be aware of the write x = true. Since Core-1 has no idea that the value of x has changed, the program may end up on the else path even though, per the logic of the source code, it should have taken the if path.

Clearly, in this example, if the part of the program's state holding the value of x is not saved during the context switch, the program may execute the wrong branch.
However, I have never seen this happen in real life, so I hypothesize that the OS must somehow be ensuring that the program state in the store buffer becomes visible to the whole system. One way to do this would be to add an mfence instruction to the context-switch code, which would drain the store buffer during the switch. However, while examining the Linux source code, I was not able to find an mfence or any other instruction that flushes the store buffer. So my question is: does the store buffer get flushed during context switches or not? And if it is not flushed, how is the OS able to ensure correctness across context switches?
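For concreteness, this is the kind of thing I expected to find in the switch path. Everything below is hypothetical: do_context_switch, save_cpu_state and restore_cpu_state are made-up names (not actual Linux functions), and the sketch only shows where I imagined the barrier would sit:

struct task;                               /* hypothetical task descriptor       */
void save_cpu_state(struct task *prev);    /* hypothetical: save registers etc.  */
void restore_cpu_state(struct task *next); /* hypothetical: restore next's state */

void do_context_switch(struct task *prev, struct task *next)
{
    save_cpu_state(prev);
    /* Drain this core's store buffer: after mfence, every store made by
       `prev` is globally visible, so whichever core runs `prev` next
       cannot miss them. This is what I expected but could not find. */
    __asm__ __volatile__("mfence" ::: "memory");
    restore_cpu_state(next);
}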