
I have below an SPSC queue for my logger.

It is certainly not a general-use SPSC lock-free queue.

However, given a bunch of assumptions around how it will be used, the target architecture, etc., and a number of acceptable tradeoffs, which I go into in detail below, my question is basically: is it safe / does it work?

  • It will only be used on x86_64 architecture, so writes to uint16_t will be atomic.
  • Only the producer updates the tail.
  • Only the consumer updates the head.
  • If the producer reads an old value of head, it will look like there is less space in the queue than in reality, which is an acceptable limitation in the context in which it is used.
  • If the consumer reads an old value of tail, it will look like there is less data waiting in the queue than reality, again an acceptable limitation.

The limitations above are acceptable because:

  • the consumer may not get the latest tail immediately, but eventually the latest tail will arrive, and queued data will be logged.
  • the producer may not get the latest head immediately, so the queue will look more full than it really is. In our load testing we have found that, given the amount we log, the size of the queue, and the speed at which the logger drains it, this limitation has no effect - there is always space in the queue.

A final point: the use of volatile is necessary to prevent reads of the variable which each thread only reads from being optimised out.

My questions:

  • Is this logic correct?
  • Is the queue thread safe?
  • Is volatile sufficient?
  • Is volatile necessary?

My queue:

class LogBuffer
{
public:
    bool is_empty() const { return head_ == tail_; }
    bool is_full()  const { return uint16_t(tail_ + 1) == head_; }
    LogLine& head()       { return log_buffer_[head_]; }
    LogLine& tail()       { return log_buffer_[tail_]; }
    void advance_head()   { ++head_; }
    void advance_tail()   { ++tail_; }
private:
    volatile uint16_t tail_ = 0;     // write position
    LogLine log_buffer_[0xffff + 1]; // relies on the uint16_t overflowing
    volatile uint16_t head_ = 0;     // read position
};
Steve Lorimer
  • Applying common sense: if this were enough, why would you not see many lockfree libraries implement their SPSC queues in this way :) – sehe Nov 26 '14 at 00:40
  • @sehe because of the tradeoffs I've listed - using atomics guarantees that a read in the consumer will see the latest write in the producer. The tradeoff that the latest write may not be seen for several reads is typically not acceptable in a library implementation – Steve Lorimer Nov 26 '14 at 00:51

1 Answer


Is this logic correct?

Yes.

Is the queue thread safe?

No.

Is volatile sufficient? Is volatile necessary?

No, to both. Volatile is not a magic keyword that makes any variable threadsafe. You still need to use atomic variables or memory barriers for the indexes to ensure memory ordering is correct when you produce or consume an item.

To be more specific, after you produce or consume an item for your queue you need to issue a memory barrier to guarantee that other threads will see the changes. Many atomic libraries will do this for you when you update an atomic variable.

As an aside, use "was_empty" instead of "is_empty" to be clear about what it does. The result of this call is one instance in time which may have changed by the time you act on its value.

BlamKiwi
  • Given the tradeoffs that I listed, in the use that I have described, why is it not safe? Both `head` and `tail` only ever advance, `head` is only written to by the consumer, `tail` only written to by the producer. A write to `head` reordered before a write to `tail` only makes the queue look less full than it really is - the logger will log less than is currently queued, but this is acceptable because the write to `tail` will eventually be seen by the consumer. A reorder the other way around will make the producer see the queue with less space than is actually available, again acceptable – Steve Lorimer Nov 26 '14 at 00:55
  • @SteveLorimer Atomics take into account more than just whether a value is "stale". They also prevent other issues like threads reading partially updated variables. Depending on your target architecture and runtime alignment requirements it is totally possible to read a "bad index" and leave your program in an undefined state. – BlamKiwi Nov 26 '14 at 01:01
  • Ok, fair enough, but on `x86_64` a write to a `uint16_t` will be atomic - so given all that, is it still unsafe? – Steve Lorimer Nov 26 '14 at 02:01
  • @SteveLorimer x86_64 will only guarantee atomicity if the uint16_t is aligned. Nothing in your code is telling the compiler it *cannot* put your int16 over the alignment boundary on the stack, and this has an even weaker guarantee on the heap. Use proper atomic semantics and remove the guesswork for you and the compiler. – BlamKiwi Nov 26 '14 at 02:22
  • Thanks - that's the kind of input that I was looking for! – Steve Lorimer Nov 26 '14 at 02:24
  • The structure will be word aligned in our use btw - but you're right in that it should be enforced – Steve Lorimer Nov 26 '14 at 02:25
  • I just consider things like this to be part of making code that is obviously correct instead of code that just looks correct. If you don't have access to C++11, Mintomic is a good little atomics library. – BlamKiwi Nov 26 '14 at 02:42