1

I am trying to implement the following functionality:

  • Atomic and lock-free write or read-modify-write of a data type with arbitrary size (in my case usually a float/int vector with up to 6 elements).
  • Atomic read from above data type that doesn't block the writing thread. The read operation may be blocked by the write operation.

Usecase: I am trying to write the software for a CNC machine. The step pulses for the motors are generated in software in a realtime thread. This realtime thread is constantly updating a variable which holds the current position of the axis. Multiple other non-realtime threads may read that variable, e.g. to display the current position.

Question 1: Is there a standard/accepted solution or pattern for this kind of problem?

I came up with the following idea: use an std::atomic<uint64_t> to protect the data and track weather a thread is currently writing (by checking the last bit) or has written since the read started (by incrementing the value on a write).

template <class DATA, class FN>
void read_modify_write(DATA& data, std::atomic<uint64_t>& protector, FN fn)
{
  auto old_protector_value = protector.load();
  do
  {
    // wait until no other thread is writing
    while(old_protector_value % 2 != 0)
      old_protector_value = protector.load();

    // try to acquire write privileges
  } while(!protector.compare_exchange_weak(old_protector_value, old_protector_value + 1));

  // write data
  data = fn(data);

  // unlock
  protector = old_protector_value + 2;
};

template <class DATA>
DATA read(const DATA& data, std::atomic<uint64_t>& protector)
{
  while(true)
  {
    uint64_t old_protector_value = protector.load();

    // wait until no thread is writing
    while(old_protector_value % 2 != 0)
      old_protector_value = protector.load();

    // read data
    auto ret = data;

    // check if data has changed in the meantime
    if(old_protector_value == protector)
      return ret;
  }
}

Question 2: Is the above code thread-safe and fulfilling above requirements?

Question 3: Can it be improved?

(The only theoretical problem I could find is if the counter wraps around, i.e. exactly 2^63 write operations are performed during 1 read operation. I would consider this weakness acceptable if there are no better solutions.)

Thank you

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
jffmichi
  • 50
  • 5
  • I think if you're trying to implement atomic read/write on a data type of arbitrary size, then you're going to end up using locks, implicitly or explicitly. Atomic and lock-free accesses are a feature of hardware. – Anonymous1847 Oct 01 '20 at 01:57
  • @curiousguy Sure. It just can't be both atomic and lock-free. – Anonymous1847 Oct 07 '20 at 04:20
  • @curiousguy You'd need a mutex or some other synchronization device. The ability to modify a chunk of memory atomically with no extra explicit synchronization usage is a feature provided by hardware. x86, for example, allows native synchronous access (via the LOCK prefix) only for data sizes normally handled by instructions, like 1, 2, 4, 8 bytes, although on a hardware implementation level I think it synchronizes in 64-byte pages. I'm pretty sure there's not a, say, 256-byte synchronization capability in the hardware, though. – Anonymous1847 Oct 07 '20 at 04:27
  • @curiousguy Like how mpoeter's answer says, the OP's code essentially implements a software spinlock. Hence not lock-free, however atomic it may be. Additionally, it may be worth mentioning that lock-free refers only to software locks, as the hardware might itself have some sort of lower-level locks. – Anonymous1847 Oct 07 '20 at 04:29
  • @Anonymous1847 Oops I think there was an English language issue here! I read "Atomic and lock-free accesses" as "Atomic accesses, as well as lock-free accesses, (...)" so the phrase would apply to both accesses. But it meant "accesses that are BOTH atomic and lock-free"!!! My bad! – curiousguy Oct 07 '20 at 08:37
  • @Anonymous1847 "_It just can't be both atomic and lock-free_" It depends what you mean by lock free... – curiousguy Oct 07 '20 at 11:11

1 Answers1

4

Strictly speaking your code is not lock-free, because you effectively use the LSB of protector to implement a spinlock.

Your solution looks very much like a sequence lock. However, the actual read operation auto ret = data; is strictly speaking a data race. To be fair, it is simply not possible to write a fully standard compliant seqlock in C++17, for that we have to wait for C++20.

It is possible to extend the seqlock to make read operations lock-free at the cost of higher memory usage. The idea is to have multiple instances of the data (let's call them slots), and the write operation always writes to the next slot in a round robin fashion. This allows a read operation to read from the last fully written slot. Dmitry Vyukov described his approach in Improved Lockfree SeqLock. You can take a look at my seqlock implementation that is part of my xenium library. It also optionally allows lock-free read operations with a configurable number of slots (though it slightly differs to Vyukov in how a reader finds the last fully written slot).

mpoeter
  • 2,574
  • 1
  • 5
  • 12
  • Thank you very much. I did not know about "sequence locks", this was exactly the information I was missing. – jffmichi Oct 02 '20 at 03:03
  • Unfortunately it is a bit tricky to get this right. If you want to implement a seqlock yourself, I recommend to take a look at this presentation by Hans-J. Boehm: [Can Seqlocks Get Along with Programming Language Memory Models?](http://safari.ece.cmu.edu/MSPC2012/slides_posters/boehm-slides.pdf). Unfortunately I could only find a link to the slides, the link to the accompanying technical report on the HP page is dead. – mpoeter Oct 02 '20 at 09:47