Does the correct use of "volatile" still (always) result in a program with undefined interaction with that data?

Question

Online examples of correct use of the volatile keyword appear to be like so:

void Foo (volatile SomethingExternal * x, int data_update)
{
  while (x->busy);

  x->data = data_update;
}

But it seems that if the data that x points to is genuinely volatile, then a context switch may occur between exiting the while loop and writing to the data, so if it's important that the busy flag is false when we access it then isn't this code unsafe?

What is volatile, the pointer or the object being pointed to? — wildplasser, Jul 29 '20 at 23:45
"unsafe" is conditional. You may be on a baremetal platform where a context switch cannot occur here, in which case you still need `volatile` to ensure that accesses are not elided, coalesced, or optimized out. — nanofarad, Jul 29 '20 at 23:46
@nanofarad, okay, so the volatile keyword is really for when context switching isn't allowed (or when we just want the more "recent" information, but don't care about defined behaviour). Is that right? — Elliott, Jul 29 '20 at 23:49
Re "*so the volatile keyword is really for when context switching isn't allowed*", No, that's irrelevant. The hardware could change `x->busy` even without a context switch. Presumably, the hardware in this example doesn't set `x->busy` until you send a command. — ikegami, Jul 29 '20 at 23:51
If you do have a multitasking kernel, it is usual to signal a semaphore instead of using volatile flags, (though both can be used - the I/O thread polling several interrupt flags when it becomes ready after the sema signal). — Martin James, Jul 30 '20 at 05:18

nanofarad · Accepted Answer · 2020-07-30T00:59:23.880

1

This is not quite true. There are constructs which, by design, are correct when implemented with volatile operations. From the standard as quoted in [this answer]:

The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.

This gives us guarantees that all volatile data will be read and written as requested, without reordering with respect to the current thread.

As an example of a structure which is correct even with context-switching, the low-level acquisition of a mutex can be implemented using Dekker's Algorithm. This algorithm does not require an atomic compare-and-swap operation, but it does require the use of volatile-qualified memory. Since volatile operations of one thread are not reordered as seen by anyone (including external threads), the algorithm's correctness holds (the proof requires that operations not be reordered). Likewise, because volatile reads always read from actual memory and not from a cached value, the algorithm can make progress when the lock is made available.

It is left as an exercise for the reader to show that this algorithm can be used to construct, for example, a safe locking idiom.
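For illustration, here is a minimal sketch of Dekker's algorithm for two threads, written with volatile-qualified flags. The names `wants_to_enter`, `turn`, `dekker_lock`, and `dekker_unlock` are made up for this example. It assumes a platform where the hardware does not reorder these memory accesses as seen by the other thread; on machines that do reorder, memory barriers would be needed in addition to `volatile`.

```c
#include <stdbool.h>

/* Shared state for exactly two threads, numbered 0 and 1.
 * All names here are illustrative, not from any library. */
static volatile bool wants_to_enter[2] = { false, false };
static volatile int turn = 0;

/* Acquire the lock on behalf of thread `self` (0 or 1). */
void dekker_lock(int self)
{
    int other = 1 - self;

    wants_to_enter[self] = true;
    while (wants_to_enter[other]) {
        if (turn != self) {
            /* Not our turn: withdraw the request and wait for the turn. */
            wants_to_enter[self] = false;
            while (turn != self) {
                /* busy-wait; each iteration re-reads `turn` from memory */
            }
            wants_to_enter[self] = true;
        }
    }
    /* The critical section begins when this function returns. */
}

/* Release the lock held by thread `self`. */
void dekker_unlock(int self)
{
    turn = 1 - self;               /* give priority to the other thread */
    wants_to_enter[self] = false;
}
```

Each thread would call `dekker_lock(id)` before touching the shared data and `dekker_unlock(id)` afterwards, with `id` fixed at 0 for one thread and 1 for the other.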

Another example of (safe) use of volatile variables is the code given in your question, when executed on a single-threaded processor without context switches (e.g. a microcontroller with interrupts disabled) with x pointing into the memory mapping of an external device. This assumes that the code is actually correct for the device's intended use (i.e. as soon as busy is deasserted, a single write to the data register will initiate whatever task is required of it).

Volatile reads ensure that your program makes progress when the device is no longer busy (liveness), because the compiler cannot simply coalesce the loop into a single memory read followed by an infinite loop taken if the device was busy.
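As a rough sketch of that second case, the code from the question can be fleshed out as follows. The register layout of `SomethingExternal` is a guess for illustration only; a real device defines its own fields and widths.

```c
#include <stdint.h>

/* Hypothetical register block of a memory-mapped device. */
typedef struct {
    uint32_t busy;   /* non-zero while the device is working */
    uint32_t data;   /* writing here starts the next task */
} SomethingExternal;

/* Wait until the device is idle, then hand it new data.
 * Because x points to volatile data, the compiler must perform
 * a real read of x->busy on every loop iteration and a real
 * write to x->data afterwards. */
void Foo(volatile SomethingExternal *x, int data_update)
{
    while (x->busy) {
        /* spin until the device clears its busy flag */
    }
    x->data = (uint32_t)data_update;
}
```

On real hardware `x` would be obtained by casting the device's base address from its datasheet to `volatile SomethingExternal *`. Without the `volatile` qualifier, an optimizer would be free to read `x->busy` once and spin forever if the first read was non-zero.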

edited Jul 30 '20 at 00:59

answered Jul 30 '20 at 00:00

nanofarad


Aha! So you use volatile on shared data that can inform you that "it's your turn [the turn of a specific process/thread]". This way you know that all other processes/threads will not edit it because they will wait for their turn. In the case of the single-threaded processor without context switching it's the same: hardware sets a value to say that it's your turn (the processor's) to use those registers. – Elliott Jul 30 '20 at 00:16

@Elliott More or less. Just remember that you likely won't see Dekker's algorithm in practice on machines that have atomic instructions--atomic test-and-set lets you construct a more efficient mutex (but it requires a stronger guarantee than `volatile` provides) – nanofarad Jul 30 '20 at 00:21
Re “Since volatile operations are not reordered within a thread, the algorithm's correctness holds (the proof requires that operations not be reordered)”: Note that the proof requires not just that operations not be reordered within a thread but that they are not reordered between threads: Thread 1 must see certain of thread 0’s memory writes in the correct order and vice-versa. As the article you link to notes, on machines that reorder memory accesses, barriers are required. – Eric Postpischil Jul 30 '20 at 00:29
@nanofarad, so for memory shared between threads or processes you would only use volatile if you were unfortunate enough not to have access to test-and-set instructions? – Elliott Jul 30 '20 at 00:35
@Elliott Not quite, I wouldn't even use volatile. I'd use a normal variable and a suitable mutex that provides the correct memory barriers, preferring to keep any atomic and volatile stuff encapsulated within a narrow set of code (i.e. within my mutex, semaphore, condition variable, and other concurrency primitives I want to use). – nanofarad Jul 30 '20 at 01:00
@EricPostpischil Thanks, I've made that wording more clear. – nanofarad Jul 30 '20 at 01:00
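
To make the test-and-set point from the comments above concrete, here is a minimal sketch of a spinlock built on C11's `atomic_flag`, guarding an ordinary non-volatile variable. The names `lock_flag`, `shared_counter`, `spin_lock`, `spin_unlock`, and `increment_shared` are invented for this example. In application code a library mutex would normally play this role.

```c
#include <stdatomic.h>

/* Illustrative names; none of these come from a library. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static int shared_counter = 0;   /* plain data, no volatile needed */

void spin_lock(void)
{
    /* test-and-set returns the previous value, so keep trying
     * until the flag was clear when we set it. Acquire ordering
     * keeps the critical section from moving above the lock. */
    while (atomic_flag_test_and_set_explicit(&lock_flag,
                                             memory_order_acquire)) {
        /* busy-wait */
    }
}

void spin_unlock(void)
{
    /* Release ordering publishes the critical section's writes
     * before the flag is seen as clear. */
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

void increment_shared(void)
{
    spin_lock();
    shared_counter++;
    spin_unlock();
}
```

The atomic operations supply both the indivisible read-modify-write and the memory ordering, which is the stronger guarantee that `volatile` alone does not give.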

score 0 · Answer 2 · answered Jul 30 '20 at 00:04

In the example you link to, the model is of some device that is accessed with volatile objects. There is no other thread or process interacting with the device: Once the device finishes its task and becomes not busy, it remains not busy until you give it a new command. No other thread or process will make it busy; you own the device and have exclusive access. The memory needs to be marked volatile so that the compiler will perform an actual read when the C code checks x->busy and will perform an actual write when the C code writes x->data.

You are correct that a context switch could occur between testing x->busy and writing x->data. This would be a bug if there were another process or thread that were accessing the device. But that is not what this code is for.
