As per this question's answer, it seems that LOCK CMPXCHG on x86 actually causes a full barrier. Presumably, this is what Unsafe.compareAndSwapInt() generates under the hood as well. I am struggling to see why that is the case: with the MESI protocol, after you have updated the cache line, couldn't the CPU simply invalidate just that cache line on other cores, rather than draining ALL store/load buffers of the core which performed the CAS? Seems rather wasteful to me...

Bober02
-
With a full barrier, you would actually flush all your missed prediction changes, instead of one cache line, so wouldn't it be worse with the full barrier? But obviously I am missing something here :) – Bober02 Jul 13 '17 at 12:49
-
[Compare-and-swap](https://en.wikipedia.org/wiki/Compare-and-swap) on Wikipedia covers this: *It compares the contents of a memory location to a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail.* Without a full barrier it might be interrupted (or otherwise updated) and that could invalidate atomicity. – Elliott Frisch Jul 14 '17 at 02:27
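The compare-and-swap semantics quoted in the comment above can be sketched in Java. This is only an illustration: it uses `AtomicInteger.compareAndSet` rather than `Unsafe.compareAndSwapInt()`, and the class name `CasSemanticsDemo` and its `run` helper are made up for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original question.
class CasSemanticsDemo {
    // Returns {first CAS succeeded (1/0), second CAS succeeded (1/0), final value}.
    static int[] run() {
        AtomicInteger value = new AtomicInteger(5);
        // Expected value matches the current value, so the swap happens.
        boolean first = value.compareAndSet(5, 6);
        // The value is now 6, so expecting 5 fails and nothing is written.
        boolean second = value.compareAndSet(5, 7);
        return new int[] { first ? 1 : 0, second ? 1 : 0, value.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The second call fails because the location no longer holds the expected value, which is the "write would fail" case from the quote.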
1 Answer
Your answer, as far as I can see, is in the comments: MESI updates caches, not Store/Load buffers. But LOCK CMPXCHG says: locked operations serialize all outstanding load and store operations. This is why it needs to drain the Store/Load buffers of this CPU (and not of the others, as detailed here).
So the current CPU has to perform the atomic operation on the most recent value, which could still reside in its Store/Load buffers; that is why a fence is needed there, to actually drain them.
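A minimal sketch of what this ordering buys you at the Java level, assuming `AtomicInteger.compareAndSet` as a stand-in for `Unsafe.compareAndSwapInt()`. The class `CasPublishDemo` and its fields are hypothetical. The Java memory model gives `compareAndSet` the memory effects of both a volatile read and a volatile write, so a plain store made before the CAS is visible to any thread that observes the CAS result. That HotSpot on x86 emits `lock cmpxchg` for it is an assumption here; the code does not check it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
class CasPublishDemo {
    static int payload = 0;                                   // plain, non-volatile field
    static final AtomicInteger ready = new AtomicInteger(0);  // flag flipped by CAS

    // Returns the payload value observed by the reader thread.
    static int run() {
        payload = 0;
        ready.set(0);
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            // Spin until the CAS result becomes visible.
            while (ready.get() == 0) {
                Thread.yield();
            }
            // The plain store to payload happened before the CAS, so it is visible here.
            seen[0] = payload;
        });
        reader.start();
        payload = 42;               // ordinary store, may sit in the store buffer
        ready.compareAndSet(0, 1);  // CAS with volatile read/write memory effects
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the CAS only made its own cache line visible and left earlier stores pending, the reader could see the flag set while still reading a stale `payload`; the ordering guarantee rules that out.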

Eugene
-
I get ya! We need to fully sync up before we swap the cache line, cool! One more thing: does the full mem barrier actually cause other CPUs' store buffers to be flushed? Or only the one the thread is operating on? In other words, I was under the impression that pending changes from other CPUs are in my CPU's load buffer, so the mem barrier just drains those – Bober02 Aug 09 '17 at 20:41
-
@Bober02 I've actually updated the answer to make it a bit more clear, as per my understanding. – Eugene Aug 10 '17 at 11:59