I'm using C (more exactly: C11 with gcc) for developing some low-latency software for x64 (more exactly: Intel CPUs only). I don't care about portability or any other architecture in general.
I know that volatile is in general not the first choice for data synchronization. However, these three facts seem to be true:
- volatile enforces writing data to memory and reading it back from memory (so the value may not be "cached" in a register, which also implies that some optimizations cannot be done by the compiler)
- volatile accesses must not be reordered by the compiler
- 4 byte (or even 8 byte) values are always atomically written on x64 (same is true for reading)
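As far as I understand, the third point is only guaranteed for naturally aligned values, so it can be worth checking that assumption explicitly. A minimal sketch, assuming gcc on x86-64; the helper is_naturally_aligned is a hypothetical name introduced only for this illustration:

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof, alignas */
#include <stddef.h>    /* size_t */
#include <stdint.h>    /* uintptr_t */

/* Compile-time checks of the size/alignment assumptions behind the
 * "4 and 8 byte accesses are atomic" claim on this target. */
static_assert(sizeof(int) == 4 && alignof(int) == 4,
              "int is expected to be 4 bytes and 4-byte aligned");
static_assert(sizeof(double) == 8 && alignof(double) == 8,
              "double is expected to be 8 bytes and 8-byte aligned");

static volatile int data_ready = 0;

/* Hypothetical helper: returns 1 if p is a multiple of size, i.e. an
 * object of that size stored at p would be naturally aligned. */
static int is_naturally_aligned(const volatile void *p, size_t size)
{
    return ((uintptr_t)p % size) == 0;
}
```

A file-scope int such as data_ready gets its natural alignment from the compiler, so the run-time helper mainly matters for values that live inside packed structs or hand-computed buffers.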
Now I have this code:
#include <stdbool.h>  /* true */

typedef struct {
    double some_data;
    double more_data;
    char even_more_data[123];
} Data;

static volatile Data data;
static volatile int data_ready = 0;

/* Producer */
void thread1(void)
{
    while (true) {
        while (data_ready) ;        // wait until the consumer has taken the previous item
        const Data x = f(...);      // prepare some data
        data = x;                   // write it
        data_ready = 1;             // signal that the data is ready
    }
}

/* Consumer */
void thread2(void)
{
    while (true) {
        while (!data_ready) ;       // wait until the producer has published an item
        const Data x = data;        // copy data
        data_ready = 0;             // signal that data is copied
        g(x);                       // process data
    }
}
thread1 is a producer of Data and thread2 is a consumer of Data. Note that I rely on these facts:
- data is written before data_ready. So when thread2 reads data_ready and it's 1, then we know that data is also available (guarantee for the ordering of volatile)
- thread2 first reads and stores data and then sets data_ready to 0, so thread1 can again produce some data and store it.
- data_ready cannot have a weird state, because reading and writing an int (with 4 bytes) is automatically atomic on x64
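For comparison, here is a minimal sketch of the same handshake written with C11 <stdatomic.h> and explicit release/acquire ordering instead of the default sequentially consistent ordering. The Slot type and the try_publish / try_consume functions are hypothetical names introduced for this illustration; they are non-blocking so that the busy-poll loops stay in the callers:

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    double some_data;
    double more_data;
    char even_more_data[123];
} Data;

/* Hypothetical wrapper: a plain payload guarded by an atomic flag. */
typedef struct {
    Data data;              /* non-atomic payload */
    atomic_int data_ready;  /* 0 = slot is empty, 1 = slot is full */
} Slot;

/* Assumption: atomic_int is always lock-free on the target (true for x86-64). */
static_assert(ATOMIC_INT_LOCK_FREE == 2, "atomic_int is expected to be lock-free");

/* Producer side: returns false while the previous item is still unread. */
static bool try_publish(Slot *s, const Data *in)
{
    if (atomic_load_explicit(&s->data_ready, memory_order_acquire) != 0)
        return false;
    s->data = *in;  /* ordered before the release store below */
    atomic_store_explicit(&s->data_ready, 1, memory_order_release);
    return true;
}

/* Consumer side: returns false while no new item has been published. */
static bool try_consume(Slot *s, Data *out)
{
    if (atomic_load_explicit(&s->data_ready, memory_order_acquire) == 0)
        return false;
    *out = s->data; /* ordered after the acquire load above */
    atomic_store_explicit(&s->data_ready, 0, memory_order_release);
    return true;
}
```

thread1 would then spin on while (!try_publish(&slot, &x)) ; and thread2 on while (!try_consume(&slot, &x)) ;. As far as I know, gcc on x86-64 typically compiles acquire loads and release stores to plain mov instructions, whereas a default (seq_cst) atomic store needs a full barrier, which may explain part of a measured slowdown with atomics.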
This was the fastest option I found. Note that both threads are pinned to cores (which are isolated). They busy-poll on data_ready, because it's important for me to process the data as fast as possible.
Atomics and mutexes were slower, so I used this implementation.
My question, finally: is it possible that this does not behave as I expect? I cannot find anything wrong in the shown logic, but I know that volatile is a tricky beast.
Thanks a lot