Is it possible to coax std::atomic into emitting CMPXCHG16B on Windows x64 for 16-byte types where I'd rather not use the Interlocked operations directly, or do I just have to suck it up and do the atomic operations by hand? I can get GCC/Clang to do this on Linux, so I suspect it's just an issue with the Microsoft Standard Library.

#include <atomic>
#include <cstdint>

struct Byte16
{
    std::int64_t a, b;
};

std::atomic<Byte16> atm;
Byte16 a = { 1, 2 };
atm.compare_exchange_strong(...); // Takes a lock on Windows, but not in the Linux build of the same code
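One way to check what a given toolchain does with a 16-byte type, without reading the generated assembly, is to ask the library at compile time. The sketch below assumes C++17 (for `is_always_lock_free`); the helper name `byte16_is_always_lock_free` is hypothetical and only serves the illustration. The result depends on the compiler, the standard library, and the target flags in use.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

struct Byte16
{
    std::int64_t a, b;
};

static_assert(sizeof(Byte16) == 16, "Byte16 is expected to be 16 bytes wide");
static_assert(std::is_trivially_copyable<Byte16>::value,
              "std::atomic requires a trivially copyable type");

// Hypothetical helper: true only if this toolchain guarantees that
// std::atomic<Byte16> never falls back to a lock (requires C++17).
constexpr bool byte16_is_always_lock_free()
{
    return std::atomic<Byte16>::is_always_lock_free;
}
```

At run time, `atm.is_lock_free()` answers the same question for a particular object; with GCC that call may require linking against libatomic.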
Anders
BlamKiwi
  • Processor compatibility? Some older CPUs don't have that instruction: maybe you need to compile for a narrower target? – Yakk - Adam Nevraumont Jan 26 '15 at 02:38
  • @Yakk I've considered that, however I'm having trouble identifying what flags to pass into ICC. – BlamKiwi Jan 26 '15 at 02:54
  • Tried [these options](https://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations)? Start with the most "powerful" and see if it solves your problem? I am just guessing and googling here. – Yakk - Adam Nevraumont Jan 26 '15 at 02:58
  • @Yakk Both /QxHost and /arch:AVX still fail to produce a lock-free std::atomic. – BlamKiwi Jan 26 '15 at 03:09
  • It looks like it is indeed a Microsoft Standard Library issue. Going through the headers, there are only specializations for atomics up to 8 bytes. – BlamKiwi Jan 26 '15 at 03:38
  • For future readers, [here's pure C++ `std::atomic` code that compiles to `lock cmpxchg16b`](http://stackoverflow.com/questions/38984153/implement-aba-counter-with-c11-cas/38991835#38991835) with gcc or clang with `-mcx16` to enable use of that instruction (which is unfortunately not baseline for x86-64: missing from the earliest CPUs). – Peter Cordes Dec 16 '16 at 19:09
  • You say you know how to do this on GCC, @BlamKiwi. What is that way? On gcc 9.2.1, -mcx16 does _not_ work: "The compiler uses this instruction to implement __sync Builtins. However, for __atomic Builtins operating on 128-bit integers, a library call is always used." – Swiss Frank May 08 '20 at 09:44
  • @SwissFrank Is this an error or a warning? I haven't used GCC for C++ in a couple of years, but falling back to the libatomic library is intended behaviour. – BlamKiwi May 12 '20 at 09:33
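
Given the conclusion in the comments that the Microsoft headers only specialize atomics up to 8 bytes, the "by hand" route from the question could look roughly like the sketch below. It assumes MSVC on x64 and uses the `_InterlockedCompareExchange128` intrinsic from `<intrin.h>`. The function name `cas16` is hypothetical, and `alignas(16)` is added to `Byte16` because the intrinsic requires a 16-byte-aligned destination. The mutex branch is not lock-free and exists only so the sketch still compiles on other toolchains.

```cpp
#include <cstdint>

// Assumption: Byte16 gets 16-byte alignment, unlike the struct in the question.
struct alignas(16) Byte16
{
    std::int64_t a, b; // a is the low 8 bytes, b the high 8 bytes
};

#if defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>

// Hypothetical helper: 16-byte compare-and-swap via CMPXCHG16B.
// On failure, 'expected' is overwritten with the value currently in *dst.
inline bool cas16(Byte16* dst, Byte16& expected, const Byte16& desired)
{
    return _InterlockedCompareExchange128(
               reinterpret_cast<volatile __int64*>(dst),
               desired.b,                              // ExchangeHigh
               desired.a,                              // ExchangeLow
               reinterpret_cast<__int64*>(&expected))  // ComparandResult
           != 0;
}
#else
#include <mutex>

// Fallback with the same semantics, guarded by a single global mutex.
// NOT lock-free; present only to keep the sketch portable.
inline bool cas16(Byte16* dst, Byte16& expected, const Byte16& desired)
{
    static std::mutex guard_mutex;
    std::lock_guard<std::mutex> guard(guard_mutex);
    if (dst->a == expected.a && dst->b == expected.b)
    {
        *dst = desired;
        return true;
    }
    expected = *dst;
    return false;
}
#endif
```

As with `compare_exchange_strong`, a failed call leaves the current value in `expected`, so a typical caller would recompute `desired` from it and retry in a loop.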

1 Answer

Use `__m128` on Windows:

#include <atomic>
#include <emmintrin.h>
//...
  std::atomic<__m128> a;
  __m128 b, c;  // b: expected value, c: desired value
  a.compare_exchange_strong(b, c);
//...
DU Jiaen