Can the side effects of an atomic operation be seen immediately by other threads?

Question

In this question, one answerer says:

Atomicity means that operation either executes fully and all it's side effects are visible, or it does not execute at all.

However, below is an example given in C++ Concurrency in Action, Listing 5.5:

#include <thread>
#include <atomic>
#include <iostream>
std::atomic<int> x(0),y(0),z(0);
std::atomic<bool> go(false);
unsigned const loop_count=10;
struct read_values
{
  int x,y,z;
};
read_values values1[loop_count];
read_values values2[loop_count];
read_values values3[loop_count];
read_values values4[loop_count];
read_values values5[loop_count];
void increment(std::atomic<int>* var_to_inc,read_values* values)
{
  while(!go)
    std::this_thread::yield();
  for(unsigned i=0;i<loop_count;++i)
  {
    values[i].x=x.load(std::memory_order_relaxed);
    values[i].y=y.load(std::memory_order_relaxed);
    values[i].z=z.load(std::memory_order_relaxed);
    var_to_inc->store(i+1,std::memory_order_relaxed);
    std::this_thread::yield();
  }
}

void read_vals(read_values* values)
{
  while(!go)
    std::this_thread::yield();
  for(unsigned i=0;i<loop_count;++i)
  {
    values[i].x=x.load(std::memory_order_relaxed);
    values[i].y=y.load(std::memory_order_relaxed);
    values[i].z=z.load(std::memory_order_relaxed);
    std::this_thread::yield();
  }
}
void print(read_values* v)
{
  for(unsigned i=0;i<loop_count;++i)
  {
    if(i)
      std::cout<<",";
    std::cout<<"("<<v[i].x<<","<<v[i].y<<","<<v[i].z<<")";
  }
  std::cout<<std::endl;
}
int main()
{
  std::thread t1(increment,&x,values1);
  std::thread t2(increment,&y,values2);
  std::thread t3(increment,&z,values3);
  std::thread t4(read_vals,values4);
  std::thread t5(read_vals,values5);
  go=true;
  t5.join();
  t4.join();
  t3.join();
  t2.join();
  t1.join();
  print(values1);
  print(values2);
  print(values3);
  print(values4);
  print(values5);
}

The sample output given by the author is:

(0,0,0),(1,0,0),(2,0,0),(3,0,0),(4,0,0),(5,7,0),(6,7,8),(7,9,8),(8,9,8),(9,9,10)
(0,0,0),(0,1,0),(0,2,0),(1,3,5),(8,4,5),(8,5,5),(8,6,6),(8,7,9),(10,8,9),(10,9,10)
(0,0,0),(0,0,1),(0,0,2),(0,0,3),(0,0,4),(0,0,5),(0,0,6),(0,0,7),(0,0,8),(0,0,9)
(1,3,0),(2,3,0),(2,4,1),(3,6,4),(3,9,5),(5,10,6),(5,10,8),(5,10,10),(9,10,10),(10,10,10)
(0,0,0),(0,0,0),(0,0,0),(6,3,7),(6,5,7),(7,7,7),(7,8,7),(8,8,7),(8,8,9),(8,8,9)

The output suggests that a modification made in one thread is not immediately visible to the other threads.

And the author also says:

Thread 3 doesn’t see any of the updates to x or y; it sees only the updates it makes to z. This doesn’t stop the other threads from seeing the updates to z mixed in with the updates to x and y though.

I'm confused about why thread 3 doesn't see the modifications of x and y. Does that mean the claim that an atomic operation's side effects are visible is not true? Is this behavior consistent with the cache coherency rule, which is guaranteed by the hardware?

The only guarantee you get is that (quote) _an implementation should make atomic stores visible to atomic loads within a reasonable amount of time_. A read-modify-write (RMW) operation has a stronger guarantee: it becomes visible 'immediately' to another RMW, since both operate on the latest value in the modification order. — LWimsey, Jul 30 '18 at 03:09
@LWimsey If the side effects are not visible to other thread immediately, does that obey the cache coherency rule? — choxsword, Jul 30 '18 at 03:16
It does obey cache coherency, but simply means that an architecture may delay a store before it hits the (coherent) L1 cache. During that delay, other cores will not yet see it. Also note that cache coherency is an optional feature; a platform does not have to support it. — LWimsey, Jul 30 '18 at 03:26
@LWimsey What about a common home computer? Could an atomic operation still have some latency before it becomes visible to other threads/CPUs? — choxsword, Jul 30 '18 at 03:40
The common `X86` platform may delay a store before it becomes visible to other cores. Search for 'X86 write buffer' — LWimsey, Jul 30 '18 at 03:56
@LWimsey I'm not sure what you mean by "delay a store". Since the operation is atomic, which is not splittable, why is there a single store operation? — choxsword, Jul 30 '18 at 13:22
