
I am currently experimenting with atomic reads and writes and have hit a wall in my understanding. I understand that writing to a variable (e.g. via increment) has to be atomic, but I'm not sure about reading the variable.

Consider `_InterlockedExchangeAdd` on Windows, or `__sync_add_and_fetch` on Linux. I cannot find a function that atomically retrieves the value being updated. I did my research before posting here, and the question "Are C++ Reads and Writes of an int Atomic?" tells me that a read isn't atomic.

1) If I use the functions above, how do I atomically read the value, say if returning it from a function?

2) If I didn't want to use these functions and instead locked a mutex before every write of my "atomic" variable, would the function that retrieves the current value need to lock the mutex, copy the current value, unlock the mutex, and then return the copy?

EDIT

I am using a compiler that doesn't have access to the atomic headers, so I have to use these APIs.

Wad
  • Have you looked into using [`<atomic>`](http://en.cppreference.com/w/cpp/atomic/atomic)? – Cory Kramer Jul 05 '16 at 11:31
  • Sorry, I've updated the Q to show I can't use the atomic header. – Wad Jul 05 '16 at 11:36
  • If you are doing read-modify-write then the entire operation must be inside the mutex. One of the worst bugs I had to track down was someone who thought that as long as the read was in a mutex and so was the write that he could release it in between. – stark Jul 05 '16 at 11:38
  • Reads and writes of primitive types are not atomic as per the standard, but they may be atomic on some platforms, so it is platform dependent; the C++ language does not mandate it. That said, on Linux you have `atomic_read` to read a variable atomically. Are you looking for something else? – Arunmu Jul 05 '16 at 11:39
  • @Arunmu thanks for that, I didn't know that. What about Windows? Am I correct in assuming that that function exists because the reading isn't atomic, so a special function is needed? – Wad Jul 05 '16 at 11:48
  • @Wad Yes, reading should not be considered atomic. You may end up reading a torn write on that variable unless you use some kind of synchronization primitive. – Arunmu Jul 05 '16 at 11:49
  • @Wad No idea about Windows, sorry. – Arunmu Jul 05 '16 at 11:50
  • Possible duplicate (or at least related): http://stackoverflow.com/questions/2423567/c0x-atomic-implementation-in-c98-question-about-sync-synchronize – eerorika Jul 05 '16 at 12:08
  • On Windows you have `ReadPointerAcquire`, `ReadPointerRaw`, `WritePointerRelease`, `WritePointerRaw`, but only in the WDM. – Daniel Jul 06 '16 at 18:02

1 Answer

You cannot find the answer because there is no single way to do it that is both (a) fast and (b) portable. It depends on: C++ or C, compiler, compiler version, compiler settings, library, architecture... the list goes on and on.

Here is a starting point:

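I happen to have a snippet of disassembly which may explain why a compare-and-swap (CAS) is a reasonable alternative for an atomic read: `std::atomic_load` compiles to a plain `mov`, while `_InterlockedCompareExchange(&x, 0, 0)` and `_InterlockedOr(&x, 0)` compile to `lock cmpxchg`, which returns the current value without changing it. This is C++, x86-64, Microsoft compiler VS2015, Win64 target: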

volatile long debug_x64_i = std::atomic_load((const std::_Atomic_long *)&my_uint32_t_var);
00000001401A6955  mov         eax,dword ptr [rbp+30h] 
00000001401A6958  xor         edi,edi 
00000001401A695A  mov         dword ptr [rbp-0Ch],eax 
    debug_x64_i = _InterlockedCompareExchange((long*)&my_uint32_t_var, 0, 0);
00000001401A695D  xor         eax,eax 
00000001401A695F  lock cmpxchg dword ptr [rbp+30h],edi 
00000001401A6964  mov         dword ptr [rbp-0Ch],eax 
    debug_x64_i = _InterlockedOr((long*)&my_uint32_t_var, 0);
00000001401A6967  prefetchw   [rbp+30h] 
00000001401A696B  mov         eax,dword ptr [rbp+30h] 
00000001401A696E  xchg        ax,ax 
00000001401A6970  mov         ecx,eax 
00000001401A6972  lock cmpxchg dword ptr [rbp+30h],ecx 
00000001401A6977  jne         foo+30h (01401A6970h) 
00000001401A6979  mov         dword ptr [rbp-0Ch],eax 

    volatile long release_x64_i = std::atomic_load((const std::_Atomic_long *)&my_uint32_t_var);
00000001401A6955  mov         eax,dword ptr [rbp+30h] 
    release_x64_i = _InterlockedCompareExchange((long*)&my_uint32_t_var, 0, 0);
00000001401A6958  mov         dword ptr [rbp-0Ch],eax 
00000001401A695B  xor         edi,edi 
00000001401A695D  mov         eax,dword ptr [rbp-0Ch] 
00000001401A6960  xor         eax,eax 
00000001401A6962  lock cmpxchg dword ptr [rbp+30h],edi 
00000001401A6967  mov         dword ptr [rbp-0Ch],eax 
    release_x64_i = _InterlockedOr((long*)&my_uint32_t_var, 0);
00000001401A696A  prefetchw   [rbp+30h] 
00000001401A696E  mov         eax,dword ptr [rbp+30h] 
00000001401A6971  mov         ecx,eax 
00000001401A6973  lock cmpxchg dword ptr [rbp+30h],ecx 
00000001401A6978  jne         foo+31h (01401A6971h) 
00000001401A697A  mov         dword ptr [rbp-0Ch],eax
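
The idiom in the listing above (a read-modify-write that never changes the value) can be wrapped for both toolchains mentioned in the question. This is only a sketch: the helper names `atomic_add` and `atomic_read` are hypothetical, and it assumes a `long` counter and a compiler that provides either the MSVC intrinsics from `<intrin.h>` or the GCC `__sync` builtins.

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h>   // _InterlockedExchangeAdd, _InterlockedCompareExchange
#endif

// Hypothetical helper: atomically add `delta` and return the NEW value.
inline long atomic_add(volatile long* p, long delta) {
#if defined(_MSC_VER)
    // _InterlockedExchangeAdd returns the value held BEFORE the addition.
    return _InterlockedExchangeAdd(p, delta) + delta;
#else
    return __sync_add_and_fetch(p, delta);
#endif
}

// Hypothetical helper: atomically read *p using a compare-and-swap that
// cannot change the value. If *p == 0 it stores 0 (no visible change);
// otherwise it stores nothing. Either way the previous value is returned.
inline long atomic_read(volatile long* p) {
#if defined(_MSC_VER)
    return _InterlockedCompareExchange(p, 0, 0);
#else
    return __sync_val_compare_and_swap(p, 0L, 0L);
#endif
}

// Single-threaded sanity check of the two helpers.
inline void atomic_read_self_check() {
    volatile long counter = 0;
    atomic_add(&counter, 5);
    assert(atomic_read(&counter) == 5);
}
```

Note that this read is a full locked operation, so it needs the variable to be in writable memory and costs more than a plain load. That is the trade-off between "fast" and "portable" described above.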

Your plan (2) of using a mutex is correct: lock the mutex, copy the current value, unlock the mutex, then return the copy.
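
A minimal sketch of the mutex approach, assuming POSIX threads are available (on Windows a `CRITICAL_SECTION` would play the same role). The class name `LockedCounter` is made up for illustration, and the code avoids C++11 features since the question rules out the atomic headers.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>

// Hypothetical wrapper: every access to value_ happens with mutex_ held.
class LockedCounter {
public:
    LockedCounter() : value_(0) { pthread_mutex_init(&mutex_, NULL); }
    ~LockedCounter() { pthread_mutex_destroy(&mutex_); }

    // The whole read-modify-write stays inside one critical section.
    long increment() {
        pthread_mutex_lock(&mutex_);
        long copy = ++value_;
        pthread_mutex_unlock(&mutex_);
        return copy;
    }

    // Lock, copy, unlock, return the copy.
    long get() {
        pthread_mutex_lock(&mutex_);
        long copy = value_;
        pthread_mutex_unlock(&mutex_);
        return copy;
    }

private:
    // Non-copyable, written in pre-C++11 style.
    LockedCounter(const LockedCounter&);
    LockedCounter& operator=(const LockedCounter&);

    pthread_mutex_t mutex_;
    long value_;
};

// Single-threaded sanity check of the wrapper.
inline void locked_counter_self_check() {
    LockedCounter c;
    c.increment();
    c.increment();
    assert(c.get() == 2);
}
```

The returned copy may already be stale by the time the caller uses it; the mutex only guarantees that the value read was not torn.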

Good luck.

Sergey D