3

On x86-64 CPUs (either Intel or AMD), is the "movnti" instruction that writes 4/8 bytes to a 32/64-bit aligned address atomic?

2 Answers

4

Yes, movnti is atomic on naturally-aligned addresses, just like all other naturally-aligned 8/16/32/64-bit stores (and loads) on x86. This applies regardless of memory type (writeback, write-combining, uncacheable, etc.). See Intel's x86 manual for the exact wording of the guarantees.

Note that atomicity is separate from memory ordering. Normal x86 stores are release-store operations, but movnt stores are "relaxed".
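To illustrate the difference, here is a minimal sketch of publishing a value written with movnti. It assumes an x86-64 target and a compiler that provides `<immintrin.h>`. The names `payload`, `ready`, `publish` and `try_consume` are hypothetical, made up for this example. The NT store itself does not tear, but without the `sfence` it could become visible after the flag store.

```cpp
#include <atomic>
#include <immintrin.h>  // _mm_stream_si64 (movnti), _mm_sfence

// Hypothetical shared state: an 8-byte naturally-aligned payload and a flag.
alignas(8) long long payload = 0;
std::atomic<bool> ready{false};

// Writer: the NT store is atomic (no tearing) but weakly ordered.
void publish(long long value) {
    _mm_stream_si64(&payload, value);  // 64-bit movnti store
    _mm_sfence();                      // order the NT store before later stores
    ready.store(true, std::memory_order_release);
}

// Reader: returns true and fills `out` once the flag has been observed.
bool try_consume(long long &out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = payload;  // ordinary aligned 8-byte load
    return true;
}
```

The compiler treats the intrinsic as an ordinary store, so the `sfence` is the only thing that restores the ordering the release store would normally give you for free.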

Fun fact: 32-bit code can use x87 (fild/fistp) or SSE/MMX movq to do atomic 64-bit loads/stores. gcc's std::atomic implementation actually does this. It's only SSE accesses larger than 8B (e.g. movaps or movntps 16B/32B/64B vector stores) that are not guaranteed atomic. (Even 16B operations are atomic on some hardware, but there's no standard way to detect this.)
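As a rough sketch of what this looks like from C++: you write a plain `std::atomic<std::uint64_t>` and let the compiler choose the instructions. The names `counter`, `set_counter`, `get_counter` and `kLlongAlwaysLockFree` are hypothetical, and which instructions you actually get depends on the compiler and target.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 64-bit shared counter.
std::atomic<std::uint64_t> counter{0};

// Relaxed store/load: atomicity only, no ordering guarantees requested.
void set_counter(std::uint64_t v) {
    counter.store(v, std::memory_order_relaxed);
}

std::uint64_t get_counter() {
    return counter.load(std::memory_order_relaxed);
}

// ATOMIC_LLONG_LOCK_FREE is a standard C++11 macro; the value 2 means
// atomic long long is always lock-free on this target.
constexpr bool kLlongAlwaysLockFree = (ATOMIC_LLONG_LOCK_FREE == 2);
```

The standard defines no matching macro for 16-byte objects, which fits the point above that wider atomicity cannot be detected portably.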

Peter Cordes
  • Beware that NT stores are weakly ordered, but C/C++11 compilers implementing std::atomic assume there are no NT stores. So if you need an NT store to be ordered with respect to a later `release` store or `seq_cst` store from this thread, you need `sfence` (or something stronger, like some independent seq_cst operation that acts as a full barrier) at some point after the NT store. Fun fact: using `xchg` for a seq_cst store wouldn't need a separate `sfence` barrier. But `mov+mfence` for seq_cst could let the `mov` store reorder with the `movnt` store *before* the barrier. – Peter Cordes Aug 05 '19 at 03:20
-1

seems clearly not:

Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation such as SFENCE should be used in conjunction with MOVNTI instructions if multiple processors might use different memory types to read/write the memory location.

caskey
  • According to Intel's Software Developer Manual Vol 3 Chapter 8.1.1, all basic read/write operations on 4/8 bytes at 32/64-bit aligned addresses are atomic. A weakly-ordered memory consistency model only says that the write/read order of such instructions is undefined. But is a single such instruction atomic? – user2744932 Sep 04 '13 at 01:41
  • As the comment says, if multiple processors are using different memory types, you have no guarantees. – caskey Sep 04 '13 at 01:44
  • Could you please give an example? Thank you. – user2744932 Sep 04 '13 at 01:53
  • In addition, what does it mean by "different memory types"? – user2744932 Sep 04 '13 at 01:57
  • 3
    Weakly ordered doesn't mean tearing, which is probably what the OP means (and not RMW). It means stores after a MOVNTI can become visible before the MOVNTI store, which is unusual for x86, which otherwise follows TSO. If the store is aligned, it will not tear. SFENCE has to do with reordering, not word tearing. – bestsss Dec 18 '13 at 07:11
  • 1
    Downvoted because the answer addresses memory ordering only, not the atomicity of the instruction. The question is already answered in the comments here though, by user2744932 and bestsss: tearing can happen when the memory location isn't aligned correctly. – Carlo Wood Mar 07 '15 at 17:05