
I am new to std::atomic in C++ and am trying to understand how compare-and-exchange operations are implemented on ARM processors. I am using gcc on Linux.

When I look at the generated assembly, I see:

        mcr p15, 0, r0, c7, c10, 5
    .L41:
        ldrexb  r3, [r2]
        cmp r3, r1
        bne .L42
        strexb  ip, r0, [r2]
        cmp ip, #0
        bne .L41
    .L42:
        mcr p15, 0, r0, c7, c10, 5

My understanding is

  1. it takes multiple instructions to do compare and exchange.
  2. ldrex marks the memory location as exclusive and reads the data.
  3. strex stores the data and clears the exclusive flag for that location.

My questions are

  1. Does ldrex mark the virtual address or the physical address as exclusive?

  2. If process P1 marks a virtual address as exclusive and a context switch to process P2 occurs, will that virtual address be accessible in P2? What happens if P2 also executes an ldrex on the same address?

  3. If process P1 marks a physical address as exclusive and a context switch occurs, isn't there a chance that, by the time P1 resumes, the data resides at a different physical address due to paging?

I am trying to understand this because I want to do a compare-and-exchange on a shared memory location accessed by multiple processes.

My C++ function looks like:

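    std::atomic<bool> *flag;
    flag = (std::atomic<bool> *) (shm_ptr);
    bool temp = false;
    while (!std::atomic_compare_exchange_strong(flag, &temp, true))
    {
        // A failed compare-exchange overwrites temp with the current value
        // (true), so reset it before retrying.
        temp = false;
        std::this_thread::yield();
    }
    // update shared memory
    std::atomic_store(flag, false);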
aravind b
  • Can you provide a little C++ example of what you've done? – hellow Sep 26 '18 at 08:30
  • Does `std::atomic_store((flag), false);` need to be atomic? The flag is 'true' after the lock is taken. It doesn't need to be released atomically, because other threads can never execute `std::atomic_compare_exchange_strong()` successfully until the 'false' value is written back. – user3124812 Jan 14 '19 at 01:38
  • @user3124812 std::atomic_store((flag), false) ensures that the stored value is seen by all cores. With a non atomic store, the stored value can be cached and written to the memory at later point and will not be visible to other cores. – aravind b Jan 17 '19 at 13:00

1 Answer

Yes, it's safe to use lock-free std::atomic<T> on shared memory mapped by different processes, on all mainstream C++ implementations for ARM.

But non-lock-free atomics won't work, because different processes won't share the same table of locks.
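
A minimal sketch of how one might guard against the non-lock-free case before treating shared memory as an atomic flag. The helper name `as_shared_flag` is hypothetical, and the cast without constructing the object mirrors the question's code rather than strictly portable C++:

```cpp
#include <atomic>
#include <cassert>

// Build-time check (C++11 macro): 2 means std::atomic<bool> is always
// lock-free on this target, 1 means "sometimes", 0 means "never".
static_assert(ATOMIC_BOOL_LOCK_FREE == 2,
              "std::atomic<bool> is not always lock-free on this target");

// Hypothetical helper: view a shared-memory pointer as an atomic flag,
// and check at run time that this particular object is lock-free.
inline std::atomic<bool>* as_shared_flag(void* shm_ptr) {
    auto* flag = static_cast<std::atomic<bool>*>(shm_ptr);
    assert(flag->is_lock_free() && "lock-based atomics do not work across processes");
    return flag;
}
```

With C++17, `std::atomic<bool>::is_always_lock_free` can be used in the `static_assert` instead of the macro.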


An interrupt before the strex completes will cause it to fail. You don't have to worry about kernel code changing the page tables between ldrex and strex.

Resuming this code in the middle after an interrupt on the same or another CPU will mean the strex simply fails, because it's not executing as part of a "transaction" started by ldrex.
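
At the C++ level, a failed strex is what `compare_exchange_weak` is allowed to report as a spurious failure, so the caller just retries. A rough sketch of the question's flag used that way; the names `lock_flag` and `unlock_flag` are made up for illustration, and the memory orders are one reasonable choice, not the only one:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: spin until the flag changes from false to true.
inline void lock_flag(std::atomic<bool>& flag) {
    bool expected = false;
    // The weak form may fail even when flag == expected (for example when
    // the store-exclusive fails after an interrupt), so it lives in a loop.
    while (!flag.compare_exchange_weak(expected, true,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
        expected = false;  // a failed CAS stores the observed value here
        std::this_thread::yield();
    }
}

// Hypothetical helper: release the flag with a release store.
inline void unlock_flag(std::atomic<bool>& flag) {
    assert(flag.load(std::memory_order_relaxed) && "flag was not held");
    flag.store(false, std::memory_order_release);
}
```

`compare_exchange_strong`, as used in the question, hides the spurious failures by looping internally, which is the `bne .L41` branch in the assembly above.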


Atomicity is address-free on ARM, and on every normal mainstream system that implements C++11 lock-free atomics.

Everything still works if two threads / processes on different cores have the same physical page mapped to different virtual addresses. The C++11 standard explicitly recommends that implementations work this way for lock-free std::atomic<T>. (It stops short of requiring it, because then it would have to define what a process is, and functions for remapping virtual memory.)
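
One way to see this on Linux is to map the same memory at two virtual addresses and do the compare-exchange through one mapping while reading through the other. This is only a sketch: it assumes Linux with glibc's `memfd_create`, the function name is hypothetical, and accessing the second mapping through a cast relies on the same implementation behavior as the question's code:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical demo: returns true if a CAS done through one mapping is
// observed through a second mapping of the same physical page.
bool cas_seen_through_other_mapping() {
    const std::size_t len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    int fd = memfd_create("atomic_demo", 0);  // anonymous in-memory file (Linux)
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) { close(fd); return false; }

    void* a = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mappings remain valid after the descriptor is closed
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
        return false;
    }
    assert(a != b);  // two live mappings occupy different virtual addresses

    auto* via_a = new (a) std::atomic<bool>(false);    // construct in shared page
    auto* via_b = static_cast<std::atomic<bool>*>(b);  // same object, other address

    bool expected = false;
    const bool swapped = via_a->compare_exchange_strong(expected, true);
    const bool seen = via_b->load();

    munmap(a, len);
    munmap(b, len);
    return swapped && seen;
}
```

Two processes that each `mmap` the same shared-memory object are in the same situation, just with the two virtual addresses living in different address spaces.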

This is nearly a duplicate of "Are lock-free atomics address-free in practice?". See that question for quotes from the standard and more details.


Modern computer systems ensure that their caches don't have aliasing homonym / synonym problems, because that would cause coherency problems in general, not just for atomic RMWs. Sometimes this requires cooperation from the OS kernel (e.g. page coloring if one cache index bit comes from the page number instead of just the offset-within-a-page part of the address), but in general caches behave as physical.

(Some early CPUs, like early MIPS, did sometimes use virtually-addressed L1 data caches, but that's not done on systems that can support multiple CPUs, AFAIK.)

Peter Cordes
  • Thanks for clarifying. I didn't know that the exclusive tag would be cleared in case of an interrupt/process switch. I was confused because the ARM documentation for LDREX/STREX describes the LDREXB operation as follows: – aravind b Sep 27 '18 at 06:25
  • `If ConditionPassed(cond) then processor_id = ExecutingProcessor() Rd = Memory[Rn,1] physical_address = TLB(Rn) If NonCachable(Rn) == 1 then MarkExclusiveGlobal(physical_address, processor_id, 1) MarkExclusiveLocal(physical_address,processor_id, 1)` and a note: "The address used in a STREX instruction must be the same as the address in the most recently executed LDREX instruction." – aravind b Sep 27 '18 at 06:26
  • The OS will do a `clrex` on a context switch/interrupt event. In these infrequent events, the ldrex/strex may loop. However, important to multi-CPU systems, it will never lock the bus. A non-looping atomic (like `SWP`) would need to lock the bus and **always** block all system CPUs from accessing memory. If you try to do this with some hardware device (DMA) involved, you could have issues. – artless noise Sep 27 '18 at 14:53
  • @artlessnoise: Normally the extra loads/stores done by the interrupt handler will be sufficient to fail the LL/SC transaction, but thanks, I didn't know about `clrex`. That makes sense just in case. – Peter Cordes Sep 28 '18 at 01:32
  • @artlessnoise: `swp` on an aligned cacheable location can just lock that cache line in M state in the L1d of the core running the instruction, like x86 CPUs do because all their atomic RMW operations are "strong", not looping. The downside is potential interrupt latency, not blocking other cores, unless you have a multi-core with only shared caches, or no cache. Multiple cores can `swp` different cache lines at the same time, like on x86 CPUs. – Peter Cordes Sep 28 '18 at 01:33
  • The ARM `swp` on many ARM CPUs will lock the entire bus. It is part of the bus protocol. It is possible to implement a non-looping atomic like you say, but I am unaware of ARM cpus that do this. Can you give a reference? – artless noise Sep 28 '18 at 15:12
  • @artlessnoise: oh. Maybe I'm mistaken, I just assumed they wouldn't because that's horrible. No wonder they deprecated it. I shouldn't have asserted my claims as facts, I normally try to say when I'm basing things on assumptions. – Peter Cordes Sep 28 '18 at 15:18