Can one CPU core observe others' modification immediately?

Question

Suppose core A modifies a word in memory, and then core B tries to load the same word. In this case, can core B get a stale value?

As I understand it, this is possible. For example, the cache invalidation message from core A may be queued in core B's invalidation queue but not yet processed.

But I have never observed this when testing with two threads: one writes the data, and the other reads it and then checks the order.
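
A test along these lines might look like the following sketch. It assumes C++11 `std::atomic` and `std::thread`; the function name `count_backward_steps` and the choice of storing an increasing counter are illustrative assumptions, not the exact test that was run. The writer stores 1..n to one shared word, and the reader polls that word and counts how often an older value reappears after a newer one.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical helper: one thread stores 1..n to a shared word while another
// polls it. Returns how many times the reader saw the value go backwards.
std::uint64_t count_backward_steps(std::uint64_t n) {
    assert(n > 0);
    std::atomic<std::uint64_t> word{0};
    std::uint64_t backward = 0;  // written only by the reader, read after join

    std::thread writer([&] {
        for (std::uint64_t i = 1; i <= n; ++i)
            word.store(i, std::memory_order_relaxed);
    });

    std::thread reader([&] {
        std::uint64_t last = 0;
        while (last < n) {  // stop once the final value has been observed
            std::uint64_t cur = word.load(std::memory_order_relaxed);
            if (cur < last) ++backward;  // an older value reappeared
            last = cur;
        }
    });

    writer.join();
    reader.join();
    return backward;
}
```

Calling `count_backward_steps(1000000)` from `main` should return 0, because loads of a single atomic location never move backwards in that location's modification order. Note what the test cannot show: the reader may skip values or lag behind the writer, and without a shared clock there is no way to tell whether a load "really" ran after a particular store.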

How do you enforce the "then" in *then core B tries to load ...*? The normal definition of "after" means after A's store becomes globally visible (by committing to L1d cache), so (because cache is coherent), you can't see a stale value. If by chance the load happened to execute a couple clock cycles after the store, sure, it could see the old value, because the store wouldn't even have retired from the out-of-order back-end in core A. — Peter Cordes, Aug 03 '20 at 16:15
A more relevant question would be: suppose that thread A updates location X and then updates location Y. Would thread B, which is continually polling both of those memory locations, see the updates happen in the same order? On many (most?) multi-processor systems, if no memory barrier instructions are used by either thread, then thread B could see the updates happen out of order. — Solomon Slow, Aug 03 '20 at 17:40
@SolomonSlow: True, but not on x86-64. x86's memory model is program-order + a store buffer with store forwarding, so only StoreLoad reordering is possible. (And store-forwarding effects if a thread reloads its own recent stores). https://preshing.com/20120930/weak-vs-strong-memory-models/. x86-64 does release/acquire for free, only needing barriers for seq_cst. — Peter Cordes, Aug 03 '20 at 17:58
@Peter Cordes: yes, you are correct. The problem is that L1D is a local cache. Consider the following sequence: 1. core A updates location X in its local cache and sends a cache line invalidation message to core B; 2. core B receives the message but delays the actual invalidation by queuing it in its invalidation queue; 3. before the invalidation takes place, core B tries to read X. Since core B has an out-of-date copy of X, I think core B could get stale data. Right? Or will core B check its invalidation queue before loading X? — Changbin Du, Aug 04 '20 at 14:22
The L1d caches of all cores are *coherent*. Committing to L1d can't happen until all other copies of the line are already invalidated (MESI) and this core has received an acknowledgement of it (i.e. a reply to its RFO, read-for-ownership). If core B delays processing and replying to the invalidation, that delays core A from committing to L1d cache. See [The ordering of L1 cache controller to process memory requests from CPU](https://stackoverflow.com/q/38034701) — Peter Cordes, Aug 04 '20 at 14:29
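
The two-location ordering question raised in the comments can be sketched as a message-passing litmus test. This is a minimal sketch assuming C++11 atomics; the function name `count_reordered_observations` and the variables `x` and `y` are hypothetical. Thread A stores to `x` and then to `y`; thread B waits until it sees `y` set and then reads `x`. In portable C++ the release store and acquire load are what forbid B from seeing the new `y` with the old `x`. Per the comment above, x86-64 provides this ordering for ordinary stores and loads, so no barrier instructions are needed there.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical litmus test: returns the number of runs in which the reader
// saw the flag y set but still read the old value of x.
int count_reordered_observations(int runs) {
    assert(runs > 0);
    int violations = 0;
    for (int r = 0; r < runs; ++r) {
        std::atomic<int> x{0}, y{0};
        int seen_x = -1;  // written by thread b, read after join

        std::thread a([&] {
            x.store(1, std::memory_order_relaxed);  // update location X
            y.store(1, std::memory_order_release);  // then update location Y
        });
        std::thread b([&] {
            // Poll Y until the update is visible, then read X.
            while (y.load(std::memory_order_acquire) == 0) { }
            seen_x = x.load(std::memory_order_relaxed);
        });

        a.join();
        b.join();
        if (seen_x != 1) ++violations;  // saw Y's update without X's
    }
    return violations;
}
```

With the release/acquire pair in place the function returns 0 on any conforming implementation. Weakening both operations on `y` to `std::memory_order_relaxed` removes that guarantee in the language model, although a given compiler and a strongly ordered CPU may still never show a violation.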

0 Answers