I also came up with another solution, based on a SeqLock. After realizing that what I am trying to achieve is essentially tear detection, I rewrote it using a SeqLock pattern. I still declare my three variables a, b, c as _Atomic uint32_t, since I also want to modify them in thread_low_priority using atomic_fetch_* (see the sketch after the code below).

On the ARMv7-M architecture, RMW atomic operations are implemented with ldrex/strex: the compiler emits a loop that retries until the strex succeeds. In my case this could be a problem when using RMW operations, because thread_high_priority needs to be fast and to run without interruption. I currently don't know whether there is a case where strex always fails in the thread_high_priority context, which would effectively livelock that thread.
#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t a, b, c;
atomic_uint seqcount = 0;

/* Writer: runs at high priority and cannot be preempted by the reader. */
void thread_high_priority(void)
{
    uint32_t _a, _b, _c;
    unsigned int orig_cnt = atomic_load_explicit(&seqcount, memory_order_relaxed);

    /* Make seqcount odd before touching the data (barrier placement discussed below). */
    atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    _a = atomic_load_explicit(&a, memory_order_relaxed);
    _b = atomic_load_explicit(&b, memory_order_relaxed);
    _c = atomic_load_explicit(&c, memory_order_relaxed);
    atomic_store_explicit(&a, _a - 1, memory_order_relaxed);
    atomic_store_explicit(&b, _b + 1, memory_order_relaxed);
    atomic_store_explicit(&c, _c - 1, memory_order_relaxed);

    /* Make seqcount even again; release orders the data stores before it. */
    atomic_store_explicit(&seqcount, orig_cnt + 2, memory_order_release);
}

/* Reader: retries until it observes an even, unchanged seqcount. */
void thread_low_priority(void)
{
    uint32_t _a, _b, _c;
    unsigned int c0, c1;

    do {
        c0 = atomic_load_explicit(&seqcount, memory_order_acquire);
        _a = atomic_load_explicit(&a, memory_order_relaxed);
        _b = atomic_load_explicit(&b, memory_order_relaxed);
        _c = atomic_load_explicit(&c, memory_order_relaxed);
        /* Acquire fence keeps the data loads above from being reordered
           past the re-read of seqcount. */
        atomic_thread_fence(memory_order_acquire);
        c1 = atomic_load_explicit(&seqcount, memory_order_relaxed);
    } while ((c0 & 1) || c0 != c1);
}
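
The code above only shows the read side of thread_low_priority. Below is a minimal sketch of how the atomic_fetch_* updates mentioned earlier could look on the low-priority side; the helper name update_from_low_priority, its parameter, and the relaxed orderings are my assumptions for illustration, not part of the code above.

#include <stdatomic.h>
#include <stdint.h>

extern _Atomic uint32_t a, b, c;

/* Hypothetical low-priority update path. On ARMv7-M each atomic_fetch_*
 * compiles to an ldrex/strex retry loop, so it may be restarted if the
 * exclusive monitor is cleared, e.g. by an interrupt or context switch. */
void update_from_low_priority(uint32_t delta)
{
    atomic_fetch_add_explicit(&a, delta, memory_order_relaxed);
    atomic_fetch_sub_explicit(&b, delta, memory_order_relaxed);
    atomic_fetch_add_explicit(&c, delta, memory_order_relaxed);
}

Keeping RMW operations out of thread_high_priority (as in the code above) avoids the strex retry concern there; a failed strex on the low-priority side simply retries after thread_high_priority has run.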
Edit: After checking the compiler output again, I slightly modified my code in thread_high_priority. It was compiled using ARM gcc 10.3.1 (2021.10 none) with the flags -O1 -mcpu=cortex-m3 -std=gnu18 -mthumb.
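
Output like the listings below can be reproduced with an invocation along these lines (the arm-none-eabi- toolchain prefix and the file name seqlock.c are assumptions; only the flags are the ones listed above):

arm-none-eabi-gcc -S -O1 -mcpu=cortex-m3 -std=gnu18 -mthumb seqlock.c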
In my original code, dmb ish was issued before the store, as shown below.
atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_release);
--->
adds r1, r2, #1
dmb ish
str r1, [r3]
After separating the memory barrier from the store, dmb ish is issued after the store, so the update of seqcount becomes visible before a, b, c are updated.
atomic_store_explicit(&seqcount, orig_cnt + 1, memory_order_relaxed);
atomic_thread_fence(memory_order_release);
--->
adds r1, r2, #1
str r1, [r3]
dmb ish