Suppose there's a multithreaded application where a single thread inserts elements into a circular linked list while a number of worker threads walk through this list, carrying out the actual processing.
Say the node type is similar to this:
struct Node
{
    // ...
    std::atomic< Node * > next;
};
And in the method that performs the insertion, there's the following snippet:
auto newNode = new Node( );                                              // (A)
newNode->next.store( previousNode->next.load( std::memory_order_relaxed ) ,
                     std::memory_order_relaxed );                        // (B)
previousNode->next.store( newNode , std::memory_order_relaxed );         // (C)
where previousNode has already been determined to be the node that should precede newNode in the list.
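How previousNode is located isn't important for this question; purely for illustration, it might come from a walk over the circular list like the hypothetical sketch below, where ShouldInsertAfter stands in for whatever insertion criterion the application actually uses:

// Illustrative only: walk the circular list until a hypothetical
// application-specific predicate picks the node to insert after.
Node * FindInsertionPoint( Node * start )
{
    Node * node = start;
    while ( !ShouldInsertAfter( * node ) )   // hypothetical predicate
        node = node->next.load( std::memory_order_relaxed );
    return node;
}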
The worker threads walk through the list in a manner similar to this:
// ...
while ( true )
{
    ProcessNode( * currentNode );
    currentNode = currentNode->next.load( std::memory_order_relaxed );
}
It's not a problem if a node just created in line (A) is skipped by the worker threads until its predecessor is updated in (C).
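Putting the pieces together, a minimal self-contained version of the scenario would look roughly like this (ProcessNode is only declared here, as a stand-in for the real processing):

#include <atomic>

struct Node
{
    // ...
    std::atomic< Node * > next{ nullptr };
};

void ProcessNode( Node & );   // the actual processing, irrelevant to the question

void Insert( Node * previousNode )
{
    auto newNode = new Node( );                                           // (A)
    newNode->next.store( previousNode->next.load( std::memory_order_relaxed ) ,
                         std::memory_order_relaxed );                     // (B)
    previousNode->next.store( newNode , std::memory_order_relaxed );      // (C)
}

void Worker( Node * currentNode )
{
    while ( true )
    {
        ProcessNode( * currentNode );
        currentNode = currentNode->next.load( std::memory_order_relaxed );
    }
}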
Is there any issue with such a design? I'm concerned that at the assembly level the code generated for (B) and (C) could be something like this:
LOAD( R1 , previousNode->next ) // (1) loads previousNode->next into register R1
WRITE( newNode->next , R1 ) // (2) writes R1 to newNode->next
WRITE( previousNode->next , newNode ) // (3) writes newNode to previousNode->next
And then some optimization could reorder it to:
LOAD( R1 , previousNode->next ) // (1)
WRITE( previousNode->next , newNode ) // (3)
WRITE( newNode->next , R1 ) // (2)
and that would break the worker threads, since they could then reach newNode before its next member has been initialized.
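In C++ source terms, that reordering would amount to the two stores happening as if the code had been written like this:

auto oldNext = previousNode->next.load( std::memory_order_relaxed );   // (1)
previousNode->next.store( newNode , std::memory_order_relaxed );       // (3) newNode is published first
newNode->next.store( oldNext , std::memory_order_relaxed );            // (2) its next is initialized only afterwards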
Is this a legitimate concern? What does the standard say about this?