From what I've learnt, cache coherence is defined by the following three requirements:
A read R from an address X on a core C returns the value written by the most recent write W to X by C, provided no other core has written to X between W and R.
If a core C1 writes to X and a core C2 reads X after a sufficient time, with no other writes in between, then C2's read returns the value from C1's write.
Writes to the same location are serialized: any two writes to X must be seen to occur in the same order on all cores.
As far as I understand these rules, they basically require all threads to see updates made by other threads within some reasonable time and in the same order, but there seems to be no requirement that all threads see the same data at any single point in time. For example, say thread A wrote a value to a shared memory location X, and then thread B wrote another value to X. Threads C and D reading from X must see the same order of updates: A, then B. Now imagine that thread C has already seen both updates, while thread D has only observed A (update B is yet to reach it). Provided the interval between the writes to X and the reads from X is small enough (less than what we consider a sufficient time), this situation doesn't violate any rule of coherence, does it?
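To make the scenario concrete, here is a toy model (my own sketch, not real hardware; the `CoreView` class is hypothetical): writes to X are serialized into one global order, and each core consumes that order at its own pace. Core C has applied both writes, core D only the first, yet both see the same order, which is all the serialization rule demands:

```python
# Toy model: writes to X are serialized into one global order;
# each core applies a prefix of that order at its own pace.
serialized_writes = [("A", 1), ("B", 2)]  # global write order for X

class CoreView:
    """A core's view of X: it has applied some prefix of the write order."""
    def __init__(self, applied):
        self.applied = applied  # how many writes this core has observed

    def value(self):
        return serialized_writes[self.applied - 1][1] if self.applied else None

    def history(self):
        return [w for w, _ in serialized_writes[:self.applied]]

core_c = CoreView(applied=2)  # C has seen both A and B
core_d = CoreView(applied=1)  # D lags: it has only seen A

# Both cores see the writes in the same order (the serialization rule)...
assert core_d.history() == core_c.history()[:core_d.applied]

# ...yet at this instant their values of X differ, which the three rules
# permit as long as D eventually observes B.
print(core_c.value(), core_d.value())  # 2 1
```

So in this model the "same order" requirement holds even while the two cores momentarily report different values.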
On the other hand, coherence protocols such as MSI use write invalidation to guarantee that all cores have an up-to-date value of a shared variable. Wikipedia says: "The intention is that two clients must never see different values for the same shared data". If what I wrote about the coherence rules is true, I don't understand where this requirement comes from. I mean, I realize it's useful, but I don't see where it is defined.
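For what it's worth, here is my understanding of the write-invalidate mechanism as a minimal sketch (the `Bus` class is hypothetical and deliberately simplified, not a faithful MSI model): before a core writes, every other copy of the line is invalidated, so no core can keep reading a stale valid copy:

```python
# Toy write-invalidate sketch for one cache line X.
# Per-core states: 'M' (modified), 'S' (shared), 'I' (invalid).
class Bus:
    def __init__(self, n_cores):
        self.state = ["I"] * n_cores  # each core starts with no copy
        self.value = 0

    def write(self, core, v):
        # Write-invalidate: all other copies are invalidated before the
        # write completes, so no core still holds the old value as valid.
        for c in range(len(self.state)):
            if c != core:
                self.state[c] = "I"
        self.state[core] = "M"
        self.value = v

    def read(self, core):
        if self.state[core] == "I":
            # Read miss: fetch the current value; any M copy downgrades to S.
            self.state = ["S" if s != "I" else "I" for s in self.state]
            self.state[core] = "S"
        return self.value

bus = Bus(2)
bus.write(0, 42)            # core 0 holds X in M; core 1 is invalidated
assert bus.state == ["M", "I"]
print(bus.read(1))          # core 1 misses and fetches 42; both end in S
assert bus.state == ["S", "S"]
```

In this sketch, two cores can never both hold valid copies with different values, which seems stronger than what the three rules above literally require; that gap is exactly what I'm asking about.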