Swapping buffers in single-writer-multiple-reader threads

Question

The Story

There is a writer thread, periodically gathering data from somewhere (in real-time, but that doesn't matter much in the question). There are many readers then reading from these data. The usual solution for this is with two reader-writer's lock and two buffers like this:

Writer (case 1):
acquire lock 0                        
loop
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period

Or

Writer (case 2):
acquire lock 0                        
loop
    acquire other lock
    free this lock
    swap buffers
    write to current buffer
    wait for next period

The Problem

In both methods, if the acquire other lock operation fails, no swap is done and writer would overwrite its previous data (because writer is real-time, it can't wait for readers) So in this case, all readers would lose that frame of data.

This is not such a big deal though, the readers are my own code and they are short, so with double buffer, this problem is solved, and if there was a problem I could make it triple buffer (or more).

The problem is the delay that I want to minimize. Imagine case 1:

writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1              because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)      <- **this point**
|
|
writer wakes up, and again writes to buffer0

At **this point**, other readers in theory could have read data of buffer0 if only the writer could do the swap after the reader finishes instead of waiting for its next period. What happened in this case is that just because one reader was a bit late, all readers missed one frame of data, while the problem could have been totally avoided.

Case 2 is similar:

writer writes to buffer0                reader is idle
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)
|
|                                       reader starts reading buffer1
writer wakes up                         |
it can't acquire lock1                  because reader is still reading buffer1
overwrites buffer0

I tried mixing the solutions, so the writer tries swapping buffers immediately after writing, and if not possible, just after waking up in the next period. So something like this:

Writer (case 3):
acquire lock 0                        
loop
    if last buffer swap failed
        acquire other lock
        free this lock
        swap buffers
    write to current buffer
    acquire other lock
    free this lock
    swap buffers
    wait for next period

Now the problem with delay still holds:

writer writes to buffer0                reader is reading buffer1
writer can't acquire lock1              because reader is still reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)      <- **this point**
|
|
writer wakes up
swaps buffers
writes to buffer1

Again at **this point**, all the readers could start reading buffer0, which is a short delay after buffer0 has been written, but instead they have to wait until the next period of the writer.

The Question

The question is, how do I handle this? If I want the writer to execute precisely at desired period, it needs to wait for the period using RTAI function and I can't do it like

Writer (case 4):
acquire lock 0                        
loop
    write to current buffer
    loop a few times or until the buffer has been swapped
        sleep a little
        acquire other lock
        free this lock
        swap buffers
    wait for next period

This introduces jitter. because the "few times" could happen to become longer than the "wait for next period" so the writer might miss the start of its period.

Just to be more clear, here's what I want to happen:

writer writes to buffer0                reader is reading buffer1
|                                       |
|                                       reader finishes reading,
| (writer waiting for next period)      As soon as all readers finish reading,
|                                         the buffer is swapped
|                                       readers start reading buffer0
writer wakes up                         |
writes to buffer1

What I Found Already

I found read-copy-update which as far as I understood keeps allocating memory for buffers and frees them until the readers are done with them, which is impossible for me for many reasons. One, the threads are shared between kernel and user space. Second, with RTAI, you can't allocate memory in a real-time thread (because then your thread would be calling Linux's system calls and hence break the real-time-itivity! (Not to mention using Linux's own RCU implementation is useless due to the same reasons)

I also thought about having an extra thread that at a higher frequency tries swapping buffers, but that doesn't sound like such a good idea. First, it would itself need to synchronize with the writer, and second, well I have many of these writer-readers working in different parts in parallel and one extra thread for each writer just seems too much. One thread for all writers seems very complicated regarding synchronization with each writer.

Why does the writer need to acquire the lock for the new buffer before it drops the lock for the old one? If it doesn't manage to get the new lock in time, it basically means the readers can't keep up so you're going to end up losing data whatever you do - so throw away a frame in the writer. — richvdh, Oct 18 '11 at 11:06
@richvdh The writer needs to acquire lock before dropping the previous because if it drops the previous and the other lock cannot be taken, then the writer cannot write to either buffer which is disastrous. The writer needs to be at all time functional and not blocked. — Shahbaz, Oct 18 '11 at 12:03
It is not that the reader can't keep up, it would keep up, but it's evokations maybe misaligned with the writer, so it could happen that the reader starts reading right before the writer wants to swap buffers. In this situation, in fact the current solution is to drop the frame. Obviously, this is not desired. — Shahbaz, Oct 18 '11 at 12:05
I think I see now. As janneb says, it sounds like exactly the problem triple-buffering is supposed to solve. — richvdh, Oct 18 '11 at 13:16

score 3 · Accepted Answer · answered Nov 10 '11 at 09:27

What API are you using for reader-writer locks? Do you have a a timed lock, like pthread_rwlock_timedwrlock? If yes, I think the it's a solution to your problem, like in the following code:

void *buf[2];

void
writer ()
{
  int lock = 0, next = 1;

  write_lock (lock);
  while (1)
    {
      abs_time tm = now() + period;

      fill (buf [lock]);
      if (timed_write_lock (next, tm))
        {
          unlock (lock);
          lock = next;
          next = (next + 1) & 1;
        }
      wait_period (tm);
    }
}


void
reader ()
{
  int lock = 0;
  while (1)
    {
      reade_lock (lock);
      process (buf [lock]);
      unlock (lock);
      lock = (lock + 1) & 1;
    }
}

What happens here, is that it does not really matter for the writer whether it waits for a lock or for the next period, as long as it is sure to wake up before the next period has come. The absolute timeout ensures this.

Yes I do have timed lock. Right... right! This works! Although I need to set a margin for `tm` for possible errors, but it does what I need almost perfectly. Thanks a lot! :) — Shahbaz, Nov 10 '11 at 10:32

score 1 · Answer 2 · answered Oct 18 '11 at 13:02

1

Isn't this exactly the problem triple buffering is supposed to solve. So you have 3 buffers, lets call them write1, write2, and read. The write thread alternates between writing to write1 and write2, ensuring that they never block, and that the last complete frame is always available. Then in the read threads, at some appropriate point (say, just before or after reading a frame), the read buffer is flipped with the available write buffer.

While this would ensure that writers never block (the buffer flipping can be done very quickly just by flipping two pointers, perhaps even with a CAS atomic instead of a lock), there is still the issue of readers having to wait for other readers to finish with the read buffer before flipping. I suppose this could be solved slightly RCU-esque with a pool of read buffers where an available one can be flipped.

answered Oct 18 '11 at 13:02

janneb

36,249
2
81
97

The readers waiting for each other is not a problem. They can simply move on to the last updated buffer when they are done. That however is where triple buffer may also fail. If reader one is reading buffer1 and reader two reading buffer2, then the writer cannot swap away from buffer0. Although I have to admit this is going to be highly unlikely, I can't take the risk because the program is real-time and it can't rely on chances. – Shahbaz Oct 18 '11 at 13:28
@Shahbaz: In the triple-buffering scheme I outlined above in my first paragraph, this cannot happen since there is always two write buffers. Instead, it's the readers that will block each other. – janneb Oct 18 '11 at 14:10
Ah yes, I was thinking a similar, yet different thing. I +1ed your answer because it does help me think about a way, but it still doesn't fix the problem. – Shahbaz Oct 18 '11 at 14:19

Louis Ricci · Answer 3 · 2011-10-18T16:42:06.810

1

Use a Queue (FIFO linked list)
The real-time writer will always append (enqueue) to the end of the queue
The readers will always remove (dequeue) from the beginning of the queue
The readers will block if the queue is empty

edit to avoid dynamic allocation

I would probably use a circular queue... I would use the built in __sync atomic operations. http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/Atomic-Builtins.html#Atomic-Builtins

Circular queue (FIFO 2d array)
- ex: byte[][] Array = new byte[MAX_SIZE][BUFFER_SIZE];
- Start and End index pointers
Writer overwrites buffer at Array[End][]
- Writer can increment Start if it ends up looping all the way around
Reader gets buffer from Array[Start][]
- Reader blocks if Start == End

edited Oct 18 '11 at 16:42

answered Oct 18 '11 at 15:10

Louis Ricci

20,804
5
48
62

I am afraid, as I mentioned in the text of the question, this is absolutely not possible because I cannot dynamically allocate memory while in a real-time thread using RTAI. – Shahbaz Oct 18 '11 at 16:15
In this case you'd need to use a circular queue as a 2d array. I'll update my answer – Louis Ricci Oct 18 '11 at 16:23
How does a circular queue differ from his current double-buffer approach (i.e., a queue with two elements)? I do not think this answers the question... – Nemo Oct 18 '11 at 17:45
@Nemo - you can implement the circular queue to only block (synchronize) on readers (multiple). The writer (singular) would be allowed to move without blocking. If the writer every gets to a point where it's lapped the readers it would nudge the reading pointer forward (using an atomic operation) essentially culling (dropping) the oldest buffer. – Louis Ricci Oct 18 '11 at 18:16
@LastCoder, as Nemo mentioned, what you are suggesting is exactly what I have already done. It used to be double buffer, but now I extended it to multiple buffers. In general though, this method is not good because you cannot increase the number of readers without introducing problems. – Shahbaz Oct 19 '11 at 10:59

mcdowella · Answer 4 · 2011-10-18T18:31:52.540

If you don't want the writer to wait, perhaps it shouldn't acquire a lock that anybody else might hold. I would have it perform some sort of synchronisation, though, to make sure that what it writes really is written out - typically, most synchronisation calls will cause a memory flush or barrier instruction to be executed, but the details will depend on the memory model of your cpu and the implementation of your threads package.

I would have a look to see if there is any other synchronisation primitive around that fits things better, but if push comes to shove I would have the writer lock and unlock a lock that nobody else ever uses.

Readers must then be prepared to miss things now and then, and must be able to detect when they have missed stuff. I would associate a validity flag and a long sequence count with each buffer, and have the writer do something like "clear validity flag, increment sequence count, sync, write to buffer, increment sequence count, set validity flag, sync." If the reader reads a sequence count, syncs, sees the validity flag true, reads the data out, syncs, and re-reads the same sequence count, then perhaps there is some hope that it did not get garbled data.

If you are going to do this, I would test it exhaustively. It looks plausible to me, but it might not work with your particular implementation of everything from compiler to memory model.

Another idea, or a way to check this one, is to add a checksum to your buffer and write it last of all.

See also searches on lock free algorithms such as http://www.rossbencina.com/code/lockfree

To go with this, you probably want a way for the writer to signal to sleeping readers. You might be able to use Posix semaphores for this - e.g. have the reader ask the writer to call sem_post() on a particular semaphore when it reaches a given sequence number, or when a buffer becomes valid.

What I'm doing is similar to your first idea, but simpler. Imagine many buffers, the writer at each write, swaps to a buffer that no reader is reading. Locks it. Writes on it. On next swap, tells the readers the last written buffer is that one. Problem is, if all other buffers are being read by other readers, the writer can't swap buffers and rewrites the previous one, which means all readers would miss that frame of data. — Shahbaz, Oct 19 '11 at 11:03
On your second idea of waking readers up, that also I have done. But that is one of the options of the program. The thing is, if the writer writes at 100hz, signaling the reader means the reader would also read at 100hz. But it is possible that one reader needs data at a slower rate, or even non-periodic. — Shahbaz, Oct 19 '11 at 11:04
I believe the added features suggested here cope with these problems. The Writer never pauses to wait for a reader, so readers that can keep up get all the data. A reader that can't keep up with writer can skip buffers. Each reader can check sequence numbers to find out what the latest valid buffer is. If it has not dealt with it, it can go ahead. If it has dealt with it, it can ask the writer to post a semaphore when the next buffer is ready, and then wait on that semaphore. — mcdowella, Oct 19 '11 at 18:04
that's exactly what I am doing right now. It won't solve the problem though. I don't know how to explain it in written text well enough, but you can trust me. — Shahbaz, Oct 19 '11 at 19:59

score 0 · Answer 5 · answered Oct 19 '11 at 18:16

Another option is to stick with locking, but ensure that readers never hang too long holding a lock. Readers can keep the time taken holding a lock short and predictable by doing nothing else while they hold that lock but copying the data from the writer's buffer. The only problem then is that a low priority reader can be interrupted by a higher priority task halfway through a write, and the cure for that is http://en.wikipedia.org/wiki/Priority_ceiling_protocol.

Given this, if the writer thread has a high priority, the worst case work to be done per buffer is for the writer thread to fill the buffer and for each reader thread to copy the data out of that buffer to another buffer. If you can afford that in each cycle, then the writer thread and some amount of reader data copying will always be completed, while readers processing the data they have copied may or may not get their work done. If they do not, they will lag behind and will notice this when they next grab a lock and look round to see which buffer they want to copy.

FWIW, my experience with reading real time code (when required to show that the bugs are there, and not in our code) is that it is incredibly and deliberately simple-minded, very clearly laid out, and not necessarily any more efficient than it needs to be to meet its deadlines, so some apparently pointless data-copying in order to get straightforward locking to work might be a good deal, if you can afford it.

I do afford it. In fact, I do it and that's how, like you said, I keep the reader short. The readers do profile their timing also. Let me clear one thing though. With the number of readers my system may have, I can assign (statically, but set at module insmod) a large enough number of buffers to reduce the chance of buffers not swapping to near zero. My question was mainly about "how can I unlock the write-lock on the last written buffer as soon as the last reader finishes reading from the buffer the writer wants to write to?" (Sounds complicated, right? ;) (also, letting writer `wait_period`) — Shahbaz, Oct 19 '11 at 20:05

Swapping buffers in single-writer-multiple-reader threads

The Story

The Problem

The Question

What I Found Already

5 Answers5