memory ordering in boost example of lock-free ring buffer

Question

When I read boost atomics about an example wait-free ring buffer implementation:

https://www.boost.org/doc/libs/1_66_0/doc/html/atomic/usage_examples.html#boost_atomic.usage_examples.example_ringbuffer

I am wondering if the memory_order_acquire is necessary at

if (next_head == tail_.load(boost::memory_order_acquire))

seems memory_order_relaxed should work as well. My argument is that

 value = ring_[tail];

happens-before

tail_.store(next(tail), boost::memory_order_release)

in pop() call. so we are sure data has been read before we store in push() call as

 ring_[head] = value;

I pasted the whole boost example code below for easy reference. Thanks!

#include <boost/atomic.hpp>

 template<typename T, size_t Size>
 class ringbuffer {
 public:
 ringbuffer() : head_(0), tail_(0) {}

 bool push(const T & value)
 {
    size_t head = head_.load(boost::memory_order_relaxed);
    size_t next_head = next(head);
    if (next_head == tail_.load(boost::memory_order_acquire))

//Could tail_.load above use boost::memory_order_relaxed?

    return false;
    ring_[head] = value;
    head_.store(next_head, boost::memory_order_release);
    return true;
 }
 bool pop(T & value)
{
    size_t tail = tail_.load(boost::memory_order_relaxed);
    if (tail == head_.load(boost::memory_order_acquire))
    return false;
   value = ring_[tail];
   tail_.store(next(tail), boost::memory_order_release);
   return true;
 }
 private:
   size_t next(size_t current)
   {
      return (current + 1) % Size;
   }
  T ring_[Size];
  boost::atomic<size_t> head_, tail_;

};

Maxim Egorushkin · Answer 1 · 2018-07-11T20:53:38.317

1

One reason is that in sequence:

if(next_head == tail_.load(boost::memory_order_acquire))
    return false;
ring_[head] = value; // A non-atomic store.

memory_order_acquire ensures that the following non-atomic store does not get reordered to precede that load of tail_.

memory_order_relaxed, on the other hand, does not prevent reordering, and hence is not sufficient here.

(Assuming boost::memory_order is equivalent to std::memory_order.)

Release-Acquire ordering:

On strongly-ordered systems — x86, SPARC TSO, IBM mainframe, etc. — release-acquire ordering is automatic for the majority of operations. No additional CPU instructions are issued for this synchronization mode; only certain compiler optimizations are affected (e.g., the compiler is prohibited from moving non-atomic stores past the atomic store-release or performing non-atomic loads earlier than the atomic load-acquire). On weakly-ordered systems (ARM, Itanium, PowerPC), special CPU load or memory fence instructions are used.

edited Jul 11 '18 at 20:53

answered Jul 11 '18 at 14:58

Maxim Egorushkin

131,725
17
180
271

Make sense to me now. But seems like we are trying to just ensure the sequence order within the same thread, is there any lighter way to keep the sequence order like: asm volatile("" ::: "memory"); between the two lines above? will that also work? – Yunzhou Wu Jul 11 '18 at 17:53
@YunzhouWu `asm volatile("" ::: "memory");` is a compiler memory barrier only, it does not prevent the hardware from reorderging. Added a note for you. – Maxim Egorushkin Jul 11 '18 at 20:51
you are right. So I assume memory_order_consume will work as well. Basically there is no way to keep sequence order within the same thread without involving memory synchronization primitives? – Yunzhou Wu Jul 11 '18 at 22:11
1

@YunzhouWu: `.load(boost::memory_order_acquire)` is free on x86, except for blocking compile-time reordering. Your compiler will compile it as cheaply as possible, with no extra asm instructions on platforms where none are needed. And no, consume isn't strong enough because `ring_[head] = value;` doesn't have a data dependency on `next_head`. – Peter Cordes Jul 11 '18 at 22:53
@Peter, if memory_order_acquire is only blocking compile-time reordering, how does it prevent out of order execution on x86? any way, x86 could move read before write on different variable. – Yunzhou Wu Jul 12 '18 at 01:55
@YunzhouWu: acquire doesn't require blocking StoreLoad reordering, only LoadStore and LoadLoad to keep *later* operations after the acquire load.. http://preshing.com/20120913/acquire-and-release-semantics/ – Peter Cordes Jul 12 '18 at 02:05
1

@YunzhouWu `memory_order_acquire` prevents **both** compiler and CPU reordering. On some CPUs the latter does not require any extra CPU instructions. This is what that quote says. – Maxim Egorushkin Jul 12 '18 at 09:15

score 0 · Answer 2 · answered Dec 04 '18 at 15:02

0

As far as I can see, both tail_.load(boost::memory_order_acquire) in push() and head_.load(boost::memory_order_acquire) in pop() can be relaxed and replaced with xx.load(boost::memory_order_relaxed).

answered Dec 04 '18 at 15:02

Peng Liu

1
2

memory ordering in boost example of lock-free ring buffer

2 Answers2