The following example shows 4 versions of atomically incrementing (or applying some other read-modify-write operation to) a variable a1 or a2 (depending on the version). The variable a1 or a2 may be shared with an ISR.
The question concerns a Cortex-M4 (STM32G431). The compiler is g++ (see below).
Version 1:
As I understand it, entering an ISR issues a clrex automatically, so that the first strex always fails if the sequence is interrupted. Correct? And does it follow that the ISR itself does not have to use ldrex/strex as well? The implicit clrex works as a sort of global memory clobber: would it be possible to limit the clobber to a2 in the ISR?
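For comparison, the retry loop of Version 1 can also be written portably with a compare-exchange loop, which covers other read-modify-write operations besides increment. This is only a sketch: the helper name atomic_update is made up for illustration, the relaxed ordering is an assumption that fits a single core, and whether g++ really lowers it to an ldrex/strex loop on the Cortex-M4 should be checked in the generated assembly.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not part of any library): applies `op` to the current
// value and retries until no other writer (e.g. an ISR) got in between.
template <typename F>
std::uint32_t atomic_update(std::atomic<std::uint32_t>& var, F op) {
    std::uint32_t expected = var.load(std::memory_order_relaxed);
    std::uint32_t desired;
    do {
        // Recomputed on every retry, because compare_exchange_weak refreshes
        // `expected` with the value it actually found when it fails.
        desired = op(expected);
    } while (!var.compare_exchange_weak(expected, desired,
                                        std::memory_order_relaxed,
                                        std::memory_order_relaxed));
    return desired;  // the value that was stored
}
```

Used as atomic_update(a1, [](std::uint32_t v) { return v + 1; }), this would play the same role as the ldrex/strex loop, but on the std::atomic variable a1 rather than on the plain a2.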
Version 2:
Do __disable_irq()/__enable_irq() contain a compile-time barrier? If so, are the explicit barriers unnecessary? Would it be better (performance-wise) to disable only the IRQ that could modify the variable a2?
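A variant of Version 2 that I am also considering is a scoped guard that saves the interrupt mask and restores it afterwards, instead of enabling interrupts unconditionally. The sketch below is host-compilable only because FakePort is a made-up stand-in; on the target, my assumption is that its three functions would wrap the CMSIS calls __get_PRIMASK(), __disable_irq() and __set_PRIMASK(). The signal fences are kept on the conservative side, since whether they are redundant is exactly what I am asking.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the PRIMASK register so this compiles off-target.
// On the Cortex-M4 these would wrap __get_PRIMASK(), __disable_irq() and
// __set_PRIMASK() from CMSIS (assumption, not shown here).
struct FakePort {
    static inline std::uint32_t primask = 0;  // 0 = interrupts enabled
    static std::uint32_t get_mask() { return primask; }
    static void disable() { primask = 1; }
    static void restore(std::uint32_t mask) { primask = mask; }
};

// Scoped critical section: masks interrupts on construction and restores the
// previous mask on destruction, so nested use does not re-enable too early.
template <typename Port>
class IrqLock {
public:
    IrqLock() : saved_(Port::get_mask()) {
        Port::disable();
        std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler barrier
    }
    ~IrqLock() {
        std::atomic_signal_fence(std::memory_order_seq_cst);  // compiler barrier
        Port::restore(saved_);
    }
    IrqLock(const IrqLock&) = delete;
    IrqLock& operator=(const IrqLock&) = delete;

private:
    std::uint32_t saved_;
};

std::uint32_t shared_counter = 0;  // plays the role of a2

void increment_shared_counter() {
    IrqLock<FakePort> lock;  // interrupts masked until end of scope
    ++shared_counter;
}
```

Restoring the saved mask rather than calling __enable_irq() directly would keep the guard usable from code that already runs with interrupts masked.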
Comparing Versions 1 and 2: if no IRQ hits the sequence, both should use the same number of CPU cycles, but if any IRQ arrives during it, Version 1 uses more cycles?
Version 3:
This produces additional dmb barrier instructions. But as I understand it, these dmb instructions are not necessary on a single-core M4?
Version 4:
This does not generate the dmb instructions seen in Version 3. Should this be the preferred way on a single core?
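If Version 4 turns out to be the right choice, I would probably wrap it in a small helper so the fences cannot be forgotten. A minimal sketch, where the name fenced_relaxed_increment is my own invention and the relaxed ordering rests on the single-core assumption from above:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: relaxed atomic increment bracketed by compiler-only
// fences. atomic_signal_fence constrains compiler reordering but is not
// expected to emit a hardware barrier such as dmb.
inline std::uint32_t fenced_relaxed_increment(std::atomic<std::uint32_t>& counter) {
    std::atomic_signal_fence(std::memory_order_seq_cst);
    const std::uint32_t previous =
        counter.fetch_add(1, std::memory_order_relaxed);
    std::atomic_signal_fence(std::memory_order_seq_cst);
    return previous;  // value before the increment
}
```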
#include <stm32g4xx.h>
#include <atomic>
#include <cstdint>

namespace {
    std::atomic_uint32_t a1;
    uint32_t a2;
}

int main() {
    while (true) {
        // Version 1: exclusive load/store, retried until the store succeeds
        uint32_t val;
        do {
            val = __LDREXW(&a2);
            val += 1;
        } while ((__STREXW(val, &a2)) != 0U);

        // Version 2: mask all interrupts around a plain increment
        __disable_irq();
        std::atomic_signal_fence(std::memory_order_seq_cst); // compile-time barrier really necessary?
        ++a2;
        std::atomic_signal_fence(std::memory_order_seq_cst); // compile-time barrier really necessary?
        __enable_irq();

        // Version 3: sequentially consistent atomic increment
        std::atomic_fetch_add(&a1, 1);

        // Version 4: relaxed atomic increment between compile-time barriers
        std::atomic_signal_fence(std::memory_order_seq_cst); // compile-time barrier
        std::atomic_fetch_add_explicit(&a1, 1, std::memory_order_relaxed);
        std::atomic_signal_fence(std::memory_order_seq_cst); // compile-time barrier
    }
}
Compile the above with:
arm-none-eabi-g++ -I../../../STM32CubeG4/Drivers/CMSIS/Core/Include -I../../../STM32CubeG4/Drivers/CMSIS/Device/ST/STM32G4xx/Include -I../../../STM32CubeG4/Drivers/STM32G4xx_HAL_Driver/Inc -DSTM32G431xx -O3 -std=c++23 -fno-exceptions -fno-unwind-tables -fno-rtti -fno-threadsafe-statics -funsigned-char -funsigned-bitfields -fshort-enums -ffunction-sections -fdata-sections -fconcepts -ftemplate-depth=2048 -fstrict-aliasing -Wstrict-aliasing=1 -Wall -Wextra -I. -mthumb -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard -fverbose-asm -Wa,-adhln -S -o test99.s test.cc