I recently discovered that GCC 10 no longer uses the mov plus mfence sequence for an atomic store and instead relies on the implicit lock of an xchg. Is this sufficient under the memory model, so that nothing breaks in multithreaded code?
As an example, I compiled the following code on Godbolt, first with GCC 9.3 and then with GCC 10.2 (optimization level -O2):
#include <stdint.h>
#include <atomic>

std::atomic_int32_t idx;

int32_t increment(void)
{
    return idx = (idx + 1);
}
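To make clear which operation the generated code corresponds to, here is a minimal sketch of what the assignment above does. The function names `increment_implicit`, `increment_explicit` and `increment_rmw` are made up for illustration. Assigning to a `std::atomic` is a sequentially consistent store, so the first two functions should be equivalent; the third is shown only for comparison and is a different operation.

```cpp
#include <atomic>
#include <cstdint>

std::atomic_int32_t idx{0};

// Same statement as in the question: operator= on std::atomic
// performs a store with std::memory_order_seq_cst.
int32_t increment_implicit() {
    return idx = (idx + 1);
}

// The same thing spelled out: a seq_cst load followed by a seq_cst store.
// These are two separate atomic operations, so another thread can
// modify idx in between; this is not an atomic increment.
int32_t increment_explicit() {
    int32_t next = idx.load(std::memory_order_seq_cst) + 1;
    idx.store(next, std::memory_order_seq_cst);
    return next;
}

// For comparison only: a single atomic read-modify-write.
// fetch_add returns the old value, so add 1 to get the new one.
int32_t increment_rmw() {
    return idx.fetch_add(1, std::memory_order_seq_cst) + 1;
}
```

The store in `increment_explicit` is the part that GCC 9.3 compiles to mov plus mfence and GCC 10.2 compiles to xchg; the load is a plain mov in both cases.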
The results were the following:
GCC 9.3:
increment():
    mov     eax, DWORD PTR idx[rip]
    add     eax, 1
    mov     DWORD PTR idx[rip], eax
    mfence
    ret
idx:
    .zero   4
GCC 10.2:
increment():
    mov     eax, DWORD PTR idx[rip]
    add     eax, 1
    mov     edx, eax
    xchg    edx, DWORD PTR idx[rip]
    ret
idx:
    .zero   4
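As a sketch of the property at stake: a sequentially consistent store must not be reordered with a later load from a different location. The classic store-buffering test below checks exactly that. The helper name `count_both_zero` and the iteration count are my own choices for illustration. With the default seq_cst ordering, the outcome where both threads read 0 is forbidden by the C++ memory model, regardless of whether the compiler implements the store with mov plus mfence or with xchg.

```cpp
#include <atomic>
#include <thread>

// Runs the store-buffering litmus test `iterations` times and returns
// how often both threads observed 0. With seq_cst stores and loads
// (the default for std::atomic), this count must be 0.
int count_both_zero(int iterations) {
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        // Each thread stores to one variable, then loads the other.
        std::thread t1([&] { x.store(1); r1 = y.load(); });
        std::thread t2([&] { y.store(1); r2 = x.load(); });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

Calling `count_both_zero(1000)` should return 0. A test like this can only fail to find a violation; it cannot prove that the generated code is correct.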
Could someone enlighten me, or just point me to the right section of the programming manual?
With best regards
Edit: OK, the memory-model part is answered by the two linked threads.
But my other question remains: why did this change only now, with GCC 10? The issues mentioned about Skylake etc. have also been known for quite a while.