The paper C++ and the Perils of Double-Checked Locking by Scott Meyers and Andrei Alexandrescu describes a case in which threads running on different cores may not see the same values of shared variables:
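For context, this is roughly the pattern the paper analyzes, in a minimal sketch (the Singleton class name and members are illustrative, not taken from the paper). The paper's point is that this naive version is broken: the unsynchronized first check is a data race in C++, and the compiler or hardware may reorder the construction of the object relative to the pointer assignment.

```cpp
#include <mutex>

// Minimal sketch of classic (broken) double-checked locking.
class Singleton {
public:
    static Singleton* instance() {
        if (pInstance == nullptr) {            // first check, no lock (racy)
            std::lock_guard<std::mutex> lk(m); // take the lock
            if (pInstance == nullptr)          // second check, under the lock
                pInstance = new Singleton;     // may be reordered vs. the write to pInstance
        }
        return pInstance;
    }
private:
    Singleton() = default;
    static Singleton* pInstance;
    static std::mutex m;
};

Singleton* Singleton::pInstance = nullptr;
std::mutex Singleton::m;
```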
Suppose you’re on a machine with multiple processors, each of which has its own memory cache, but all of which share a common memory space. Such an architecture needs to define exactly how and when writes performed by one processor propagate to the shared memory and thus become visible to other processors. It is easy to imagine situations where one processor has updated the value of a shared variable in its own cache, but the updated value has not yet been flushed to main memory, much less loaded into the other processors’ caches. Such inter-cache inconsistencies in the value of a shared variable are known as the cache coherency problem.
Suppose processor A modifies the memory for shared variable x and then later modifies the memory for shared variable y. These new values must be flushed to main memory so that other processors will see them. However, it can be more efficient to flush new cache values in increasing address order, so if y’s address precedes x’s, it is possible that y’s new value will be written to main memory before x’s is. If that happens, other processors may see y’s value change before x’s.
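The x/y scenario above can be sketched with C++11 atomics using relaxed ordering, which, like raw hardware stores, gives no cross-variable ordering guarantee. The names x, y, writer, and reader_saw_reorder are illustrative; on some architectures a reader may genuinely observe y's new value while still seeing x's old one.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0}, y{0};

void writer() {
    x.store(1, std::memory_order_relaxed); // this store...
    y.store(1, std::memory_order_relaxed); // ...may become visible after this one
}

// Returns true if the reader observed y's new value but x's old one,
// i.e. the two writes appeared out of program order.
bool reader_saw_reorder() {
    int ry = y.load(std::memory_order_relaxed);
    int rx = x.load(std::memory_order_relaxed);
    return ry == 1 && rx == 0;
}

// Run one writer/reader pair. After both threads join, both new values
// are guaranteed visible, because joining a thread synchronizes with it.
bool run_once() {
    x.store(0);
    y.store(0);
    bool reordered = false;
    std::thread t1(writer);
    std::thread t2([&] { reordered = reader_saw_reorder(); });
    t1.join();
    t2.join();
    return reordered;
}
```

Note that if the writer used std::memory_order_release for the store to y and the reader used std::memory_order_acquire for the load of y, then seeing y == 1 would guarantee seeing x == 1 as well.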
On the other hand, I know that processors maintain a coherent view of the data values in multiple caches by means of coherence protocols, which can write-update or write-invalidate the copies of shared data in other cores. According to Wikipedia, two clients must never see different values for the same shared data. Is that actually guaranteed by the hardware?
If coherence protocols ensure that two cores always see the same value of a shared variable, how can the situation described in the paper occur?