I am looking for a processor whose load-acquire and store-release operations have the same semantics as those specified in the C11/C++11 standards.

The x86 memory model is much too strong, so it is impossible to meaningfully test a lock-free algorithm that relies on acquire/release semantics there.

The same seems to apply to ARM processors, because that architecture offers load/store synchronization that is either stronger or weaker than acquire/release. ARMv8.3 may offer the right semantics, but I believe there are no ARMv8.3 processors on the market.

On which processor or architecture should I test a lock-free algorithm that uses acquire/release semantics?
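
For concreteness, here is a minimal sketch of the kind of acquire/release code in question, written as a message-passing test. The names `Mailbox`, `run_message_passing_once`, and `count_violations` are hypothetical and purely illustrative; the only assumption is a C++11 compiler with `<atomic>` and `<thread>`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical test fixture (names are illustrative, not from any library).
struct Mailbox {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag: stored with release, loaded with acquire
};

// Runs one producer/consumer round and returns the payload the consumer read.
int run_message_passing_once(int value) {
    Mailbox box;
    int seen = -1;

    std::thread producer([&] {
        box.payload = value;                                // (1) write the data
        box.ready.store(true, std::memory_order_release);   // (2) publish it
    });
    std::thread consumer([&] {
        // (3) spin until the release store becomes visible
        while (!box.ready.load(std::memory_order_acquire)) {
        }
        seen = box.payload;                                  // (4) must observe (1)
    });

    producer.join();
    consumer.join();
    return seen;
}

// Repeats the round and counts how often the consumer read a stale payload.
int count_violations(int rounds) {
    int violations = 0;
    for (int i = 0; i < rounds; ++i) {
        if (run_message_passing_once(i + 1) != i + 1) {
            ++violations;
        }
    }
    return violations;
}
```

With the orderings as written, `count_violations` has to return 0 on any conforming implementation. The difficulty is the opposite case: if the orderings were mistakenly weakened to `memory_order_relaxed`, x86 hardware would still not reorder the two stores or the two loads, so only compile-time reordering could expose the bug there.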

Oliv
  • The C and C++ specifications use an abstract model for memory and CPU. There doesn't have to be an existing (old or new) processor which follows that abstract model. – Some programmer dude Aug 29 '17 at 08:48
  • PowerPC is notoriously weakly-ordered, so it might be a good ISA to test on. If possible, testing on multiple ISAs with multiple *compilers* would be good, because compile-time reordering could expose or hide problems. (This is more of an issue on x86, where every store is a release-store and you only need a barrier to prevent StoreLoad reordering. Actual x86 implementations also often go beyond the official x86 rules, with undocumented guarantees that avoid breaking important legacy code, mostly Windows.) – Peter Cordes Aug 29 '17 at 08:49
  • @PeterCordes If I target PowerPC and compile code that uses acquire/release atomic operations, won't the compiler generate instructions that synchronize memory much more strongly than acquire/release requires, because these exact semantics cannot be expressed on PowerPC? – Oliv Aug 29 '17 at 08:59
  • As you say, ARM offers even weaker semantics. So on such an architecture, any proper implementation of C11 atomics should be able to implement `acq_rel` semantics with reasonable optimizations, letting you explore code paths that you would never hit on x86. – Jens Gustedt Aug 29 '17 at 10:01
  • @Oliv: PowerPC has separate instructions for different kinds of barriers. ARM64 has actual acquire-load and release-store instructions, including LL/SC versions, so it can implement `mo_acq_rel`. I'd recommend testing on both if possible! (And not just in a virtual machine simulator/emulator, unless it's a simulator specifically programmed to simulate memory reordering.) IDK if DEC Alpha hardware is still available and supported by modern versions of compilers; you might end up finding compiler bugs. – Peter Cordes Sep 01 '17 at 20:47
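
The StoreLoad reordering mentioned in the comments above can be probed with a store-buffering litmus test. The following is only a rough sketch: `count_both_zero` and its parameters are hypothetical names, and spawning fresh threads per round is a simplification that makes the interesting outcome rare in practice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical store-buffering litmus test (illustrative name and signature).
// Each thread stores to one variable, then loads the other.
// Returns how many rounds ended with both loads reading 0.
int count_both_zero(int rounds,
                    std::memory_order store_order,
                    std::memory_order load_order) {
    int both_zero = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1;
        int r2 = -1;

        std::thread t1([&] {
            x.store(1, store_order);
            r1 = y.load(load_order);  // may be satisfied before the store to x is visible
        });
        std::thread t2([&] {
            y.store(1, store_order);
            r2 = x.load(load_order);  // may be satisfied before the store to y is visible
        });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;  // allowed with release/acquire, forbidden with seq_cst
        }
    }
    return both_zero;
}
```

Called with `std::memory_order_release` and `std::memory_order_acquire`, a non-zero result is permitted by the C++11 model, even on x86. Called with `std::memory_order_seq_cst` for both arguments, the result has to be 0. A result of 0 in the release/acquire case proves nothing, since the reordering is allowed but never required.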

0 Answers