The x86 LOCK prefix allows a core to lock the system bus so that it can modify memory exclusively (or it lets the cache coherency protocol achieve the same thing). However, Intel Developer Manual 3A doesn't state that LOCK flushes the store buffer.

In contrast, do SFENCE and MFENCE prevent reordering and flush the store buffer?

So when do I need to use LOCK and when SFENCE/MFENCE?

Does LOCK concern atomicity of a single instruction, whereas SFENCE/MFENCE concern a series of instructions?

user997112
  • SFENCE only orders NT stores against regular stores, and only *eventually* flushes the store buffer, but not before later loads can execute. IDK why you'd even mention it in this context, except maybe on AMD CPUs where (I think) it has semantics more like MFENCE. – Peter Cordes Feb 25 '20 at 22:01
  • The only possible differences between a `lock`ed instruction and `mfence` are whether they order NT stores (yes) or weakly-ordered NT loads (in some cases no for locked). – Peter Cordes Feb 25 '20 at 22:07
  • So generally you only use mfence if you need a full barrier as part of something that isn't already an atomic RMW. And if you're doing a store at all, it can be more efficient to use `xchg` instead of `mov + mfence`. Or even to use a dummy `lock add qword [rsp], 0` instead of mfence. – Peter Cordes Feb 25 '20 at 22:20
  • @PeterCordes Given how powerful memory barriers are, I simply wasn't sure what the point of the LOCK instruction is. – user997112 Feb 25 '20 at 23:04
  • Oh, `lock` gives you atomic RMW, as well as being a full barrier. [Can num++ be atomic for 'int num'?](//stackoverflow.com/q/39393850) – Peter Cordes Feb 25 '20 at 23:08
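
To make the distinction in the comments above concrete, here is a minimal C++ sketch (not from the original thread). It assumes a C++11 compiler targeting x86; the function names `count_with_atomic_rmw` and `count_forbidden_outcomes` are made up for illustration. The first function uses an atomic RMW, which compilers typically implement with a `lock`-prefixed instruction. The second is a store-then-load pattern with no RMW, which needs a full barrier: either a sequentially consistent store (commonly compiled to `xchg`) or a plain store followed by a standalone fence (commonly `mfence` or a dummy locked instruction). The exact instructions depend on the compiler and version, so treat the comments about code generation as expectations rather than guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` threads each bump a shared counter `iters` times.
// fetch_add is an atomic read-modify-write. On x86, compilers typically emit a
// lock-prefixed instruction for it, which is also a full barrier.
long count_with_atomic_rmw(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, iters] {
            for (int i = 0; i < iters; ++i)
                counter.fetch_add(1, std::memory_order_seq_cst);
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}

// Hypothetical store-buffer litmus test: each thread stores to its own flag and
// then loads the other thread's flag. Without a full barrier between the store
// and the load, both threads could read 0. Returns how often that happened.
int count_forbidden_outcomes(int trials) {
    int forbidden = 0;
    for (int i = 0; i < trials; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;
        std::thread a([&] {
            // Plain store plus a standalone full fence
            // (typically mov + mfence, or mov + a dummy locked instruction).
            x.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_relaxed);
        });
        std::thread b([&] {
            // Sequentially consistent store (commonly compiled to xchg on x86).
            y.store(1, std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_seq_cst);
        });
        a.join();
        b.join();
        if (r1 == 0 && r2 == 0) ++forbidden;
    }
    return forbidden;
}
```

Under the C++ memory model, `count_with_atomic_rmw(4, 10000)` returns 40000 and `count_forbidden_outcomes` returns 0. Replacing the `fetch_add` with a non-atomic `++` would be a data race, and dropping the fence in thread `a` would allow both loads to read 0.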
