I understand what volatile does and what it doesn't do. Take the example from this question:
    void waitForSemaphore()
    {
        /* well-known address of my semaphore */
        volatile uint16_t* semPtr = WELL_KNOWN_SEM_ADDR;

        /* busy-wait until the semaphore holds the expected value */
        while ((*semPtr) != IS_OK_FOR_ME_TO_PROCEED)
            ;
    }
My question is: in the presence of a CPU cache, it seems volatile cannot guarantee that the above works. It only forces the compiler to emit a real load of *semPtr on every iteration, but the CPU does not know whether the value it gets comes from RAM or from one of its caches. Therefore, if another device changes the content at WELL_KNOWN_SEM_ADDR, waitForSemaphore won't necessarily see the change. So there must be something else that makes it work.
I have read this and this, and it seems that volatile by itself is not enough to guarantee that such a program works: there must be some platform-dependent magic that either bypasses the L1/L2/L3 caches or forces them to be flushed. Am I right? If so, is such support available on all popular platforms, for example x86?
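To make concrete the kind of support I am asking about, here is a rough sketch of the "force flush" variant. Everything in it beyond the original loop is an assumption of mine: platform_invalidate_cache is a hypothetical hook (stubbed as a counting no-op here, not a real API on any platform), waitForSemaphoreAt is a hypothetical variant that takes the address as a parameter, and the value of IS_OK_FOR_ME_TO_PROCEED is made up. Whether a real platform needs such a hook at all is exactly what I am unsure about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value; the original question does not say what it is. */
#define IS_OK_FOR_ME_TO_PROCEED ((uint16_t)1)

/* Counts how often the hypothetical hook was called (for illustration only). */
static unsigned invalidate_calls = 0;

/*
 * Hypothetical platform hook: a real port would replace this body with
 * whatever the platform provides to discard cached copies of
 * [addr, addr + len), if it needs anything at all. Here it is a no-op stub.
 */
static void platform_invalidate_cache(const volatile void *addr, size_t len)
{
    (void)addr;
    (void)len;
    ++invalidate_calls;
}

/*
 * Same busy-wait as waitForSemaphore, but the address is a parameter and
 * the hypothetical hook runs before every read of the semaphore.
 */
static void waitForSemaphoreAt(volatile uint16_t *semPtr)
{
    assert(semPtr != NULL);

    for (;;) {
        /* Ask the platform to drop any stale cached copy first. */
        platform_invalidate_cache(semPtr, sizeof *semPtr);

        /* volatile makes the compiler emit this load on every iteration. */
        if (*semPtr == IS_OK_FOR_ME_TO_PROCEED)
            return;
    }
}
```

With this shape, the original function would reduce to calling waitForSemaphoreAt(WELL_KNOWN_SEM_ADDR). The "bypass the cache" variant would presumably need no hook at all, only a mapping of that address that the platform treats as uncached.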