I have the following situation:
Process 1 (on core 1):

    set_nonzero_8byte_posix_shm_memory_to_zero();
    run_a_function();

Process 2 (on core 2):

    uint64_t v = read_that_8byte_posix_shm_memory();
    if (v != 0) {
        // infer that the function has not run yet
    }
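For concreteness, both processes map the same 8-byte POSIX shm object, roughly like the sketch below (the name "/flag_shm" and the helper are placeholders, not my real code, and error checking is omitted):

    #include <fcntl.h>
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Map the shared 8-byte flag; both processes call this. */
    static uint64_t *map_flag(void)
    {
        int fd = shm_open("/flag_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, sizeof(uint64_t));
        void *p = mmap(NULL, sizeof(uint64_t), PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        close(fd);
        return (uint64_t *)p;
    }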
Essentially, I wanted set_nonzero_8byte_posix_shm_memory_to_zero() to wait until its store was visible to the other cores, so that in other processes (i.e. process 2) the read could make the inference described above.
I thought of using an sfence between set_nonzero...to_zero() and run_a_function(), but the Linux memory-barriers documentation (https://www.kernel.org/doc/Documentation/memory-barriers.txt) says:
There is no guarantee that any of the memory accesses specified before a memory barrier will be complete by the completion of a memory barrier instruction; the barrier can be considered to draw a line in that CPU's access queue that accesses of the appropriate type may not cross.
Hence, my interpretation was that having passed the sfence (and having "completed" the set to zero) and having started run_a_function() still would not guarantee that my read in process 2 reads 0, and so process 2 could wrongly conclude that run_a_function() has not happened yet.
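For reference, the writer-side code I was considering looks something like this (using the _mm_sfence() intrinsic, with flag being the pointer from the mapping sketch above, cast as needed):

    #include <immintrin.h>
    #include <stdint.h>

    extern void run_a_function(void);

    /* Process 1: what I had in mind. */
    static void writer(volatile uint64_t *flag)
    {
        *flag = 0;        /* set_nonzero_8byte_posix_shm_memory_to_zero() */
        _mm_sfence();     /* what I hoped would make the store visible */
        run_a_function();
    }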
I was wondering how I could get the behavior I wanted. (Would accessing that address as volatile cut it, would an atomic store with sequential consistency do it, etc.?)
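By the atomic option I mean something along these lines (C11 atomics on the mapped region, assuming the 8-byte location is naturally aligned so the operations are lock-free on x86-64):

    #include <stdatomic.h>
    #include <stdint.h>

    extern void run_a_function(void);

    /* Process 1 */
    static void writer(_Atomic uint64_t *flag)
    {
        atomic_store_explicit(flag, 0, memory_order_seq_cst);
        run_a_function();
    }

    /* Process 2 */
    static int reader_thinks_not_run_yet(_Atomic uint64_t *flag)
    {
        /* Nonzero read: infer (I hope) the function has not run yet. */
        return atomic_load_explicit(flag, memory_order_seq_cst) != 0;
    }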
Information about my environment: I am on a high-core-count, dual-socket NUMA machine (x86, 64-bit); however, AFAIK everything is numactl'ed to stay on a particular socket.
Any help would be much appreciated, thank you!