This is not quite true. There are constructs which, by design, are correct when implemented with volatile operations. From the standard as quoted in [this answer]:
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
This guarantees that every volatile access is performed as written, and that the compiler does not reorder a thread's volatile accesses relative to one another.
As an example of a structure which is correct even with context-switching, the low-level acquisition of a mutex can be implemented using Dekker's Algorithm. This algorithm does not require an atomic compare-and-swap operation, but it does require the use of volatile-qualified memory. Since the compiler does not reorder the volatile operations of one thread, every observer (including other threads) sees them in program order, provided the hardware does not reorder them either, as on a single processor that merely context-switches between threads. The algorithm's correctness therefore holds (the proof requires that operations not be reordered). Likewise, because volatile reads always read from actual memory and not from a value cached in a register, the algorithm can make progress when the lock is made available.
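A minimal sketch of Dekker's Algorithm for two threads may make this concrete. The names wants_to_enter, turn, dekker_lock and dekker_unlock are illustrative choices, not part of any library. The sketch assumes that the hardware performs the volatile accesses in program order (for example, a single core that only context-switches); on hardware that reorders memory accesses, volatile alone would not be enough.

```c
#include <assert.h>

/* Shared state. Every access below is a volatile access, so the compiler
   must perform each one, in program order. */
static volatile int wants_to_enter[2] = {0, 0};
static volatile int turn = 0;

/* Acquire the lock on behalf of thread `self` (0 or 1).
   Assumes the hardware does not reorder these accesses. */
void dekker_lock(int self)
{
    assert(self == 0 || self == 1);
    int other = 1 - self;

    wants_to_enter[self] = 1;
    while (wants_to_enter[other]) {
        if (turn != self) {
            /* The other thread has priority: withdraw our request,
               wait for our turn, then ask again. */
            wants_to_enter[self] = 0;
            while (turn != self) {
                /* busy-wait; turn is re-read on every iteration */
            }
            wants_to_enter[self] = 1;
        }
    }
}

/* Release the lock and hand priority to the other thread. */
void dekker_unlock(int self)
{
    assert(self == 0 || self == 1);
    turn = 1 - self;
    wants_to_enter[self] = 0;
}
```

Each thread would bracket its critical section with dekker_lock(id) and dekker_unlock(id), using its own id of 0 or 1.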
It is left as an exercise for the reader to show that this algorithm can be used to construct, for example, a safe locking idiom.
Another example of (safe) use of volatile variables is the code given in your question, when executed on a single-threaded processor without context switches (e.g. a microcontroller with interrupts disabled) with x pointing into the memory mapping of an external device. This assumes that the code is actually correct for the device's intended use (i.e. as soon as busy is deasserted, a single write to the data register will initiate whatever task is required of it).
Volatile reads ensure that your program makes progress when the device is no longer busy (liveness), because the compiler cannot simply coalesce the loop into a single memory read followed by an infinite loop taken if the device was busy.
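As a rough illustration of that pattern, here is a sketch of a busy-wait on a memory-mapped device. The register layout (struct device_regs with busy and data fields) and the function name device_send are hypothetical and stand in for whatever the real device defines; only the pointer name x follows the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout; a real device defines its own. */
struct device_regs {
    uint32_t busy;  /* non-zero while the device is working */
    uint32_t data;  /* a write here starts a new task */
};

/* Wait until the device is idle, then hand it one value.
   Because x points to volatile data, the compiler must re-read busy on
   every iteration and must perform the store to data exactly once,
   after the loop. */
void device_send(volatile struct device_regs *x, uint32_t value)
{
    assert(x != NULL);
    while (x->busy) {
        /* spin until the device deasserts busy */
    }
    x->data = value;
}
```

Without the volatile qualifier, the compiler would be free to read busy once and, if it was set, spin forever without looking at the device again.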