
I can use volatile for something like the following, where the value might be modified by an external function/signal/etc:

volatile int exit = 0;
while (!exit)
{
    /* something */
}

And the compiler/assembly will not cache the value. On the other hand, with the restrict keyword, I can tell the compiler that a variable has no aliases / only referenced once inside the current scope, and the compiler can try and optimize it:

void update_res(int *a, int *b, int *restrict c) {
    *a += *c;
    *b += *c;
}

Is that a correct understanding of the two, that they are basically opposites of each other? volatile says the variable can be modified outside the current scope and restrict says it cannot? What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?

David542
  • The restrict in `update_restrict()` prevents you (invokes UB) from calling it like `update_restrict(&foo, &foo)` ... compare `void *memcpy(void *restrict s1, const void *restrict s2, size_t n);` (UB if objects overlap) and `void *memmove(void *s1, const void *s2, size_t n);` (works with overlapping objects) – pmg Aug 29 '20 at 08:04
  • Your `update_restrict` function doesn't use its `b` arg, and references some global(?) `c`. It's also pointless because there's only one reference to each object, so there's no chance to optimize away a store/reload, and therefore no chance to take advantage of the `restrict` promise. – Peter Cordes Aug 29 '20 at 08:06
  • @PeterCordes yes, I updated the question, there was a typo. – David542 Aug 29 '20 at 08:08
  • @pmg I updated the question (I had a typo). What do you mean by overlapping objects in `memcpy` ? – David542 Aug 29 '20 at 08:09
  • *What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?* - try it yourself on https://godbolt.org/ with gcc10 `-O3`. Seriously, put some effort in, and maybe ask about the resulting assembly if it's not what you expected. – Peter Cordes Aug 29 '20 at 08:09
  • `memcpy(text, text + 1, strlen(text)); /*UB (because of restrict)*/` vs `memmove(text, text + 1, strlen(text)); /*ok*/` – pmg Aug 29 '20 at 08:10
  • @PeterCordes oh I see, thanks for the tip. The assembly instructions were the same for me before I used the `-O3` compiler option. – David542 Aug 29 '20 at 08:12
  • In hindsight, I hope it's not surprising that making promises to the optimizer only makes any difference if you tell it to optimize. Also, seriously you need to start searching for existing answers. There's one with this *exact* example in a Q&A about the `restrict` keyword. Re-explaining stuff that's already been well-explained elsewhere isn't a good use of anyone's time or adding much value to Stack Overflow for future readers. I only answered because the contrast between volatile vs. restrict was somewhat interesting to discuss; the part asking for restrict asm was a distraction. – Peter Cordes Aug 29 '20 at 08:34

1 Answer


They're not exact opposites of each other. But yes, volatile gives a hard constraint to the optimizer to not optimize away accesses to an object, while restrict is a promise / guarantee to the optimizer about aliasing, so in a broad sense they act in opposite directions in terms of freedom for the optimizer. (And of course usually only matter in optimized builds.)

restrict is entirely optional; it only allows extra performance. volatile sig_atomic_t can be "needed" for communication between a signal handler and the main program, and volatile can be needed in device drivers. For any other use, _Atomic is usually a better choice. (_Atomic has a similar effect, especially with current compilers, which purposely don't optimize atomics.) Other than that, volatile is not needed for correctness of normal code: neither volatile nor _Atomic is needed for correctness of single-threaded code without signal handlers, regardless of how complex the series of function calls is, or how many globals hold pointers to other variables. The as-if rule already requires compilers to make asm that gives observable results equivalent to stepping through the C abstract machine one line at a time. (Memory contents are not an observable result; that's why data races on non-atomic objects are undefined behaviour.)
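As a minimal sketch of the signal-handler case, assuming a hypothetical flag `stop_requested` and a hypothetical helper `run_until_signalled` (neither is from the question), the loop below uses `raise` to stand in for a signal arriving from outside:

```c
#include <signal.h>

/* Hypothetical flag name. volatile sig_atomic_t is the object type that
   ISO C says a signal handler may safely write to. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo) {
    (void)signo;          /* unused */
    stop_requested = 1;   /* the only thing the handler does */
}

/* Installs the handler, then loops until the flag is set.
   Returns the number of loop iterations, or -1 if signal() failed. */
int run_until_signalled(void) {
    int iterations = 0;
    if (signal(SIGINT, on_sigint) == SIG_ERR)
        return -1;
    while (!stop_requested) {   /* volatile: the flag is re-read every pass */
        ++iterations;
        if (iterations == 3)
            raise(SIGINT);      /* simulate the asynchronous signal */
    }
    return iterations;
}
```

Without `volatile`, an optimizing compiler would be allowed to read the flag once before the loop, since nothing visible in the loop body modifies it.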


volatile means that every C variable read (lvalue to rvalue conversion) and write (assignment) must become an asm load and store. In practice, yes, that means it's safe for things that change asynchronously, like MMIO device addresses, or as a bad way to roll your own _Atomic int with memory_order_relaxed. ("When to use volatile with multi threading?" - basically never in C11 / C++11.)
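A small sketch of what "every read becomes a load" means, using two hypothetical functions (the names are illustrative only). Both return the same value when nothing else touches the object; the difference is only in how many loads an optimizing compiler is allowed to emit:

```c
/* Plain pointer: the optimizer may load *p once and reuse the value. */
int sum_twice_plain(const int *p) {
    int first = *p;
    int second = *p;
    return first + second;
}

/* volatile-qualified pointer: each read in the source must be performed,
   so two separate loads are expected in optimized output. */
int sum_twice_volatile(const volatile int *p) {
    int first = *p;    /* first access */
    int second = *p;   /* second access, not merged with the first */
    return first + second;
}
```

Comparing the two on a compiler explorer with optimization enabled is a reasonable way to check this on a particular compiler and target.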

volatile says the variable can be modified outside the current scope

It depends what you mean by that. Volatile is far stronger than that, and makes it safe for it to be modified asynchronously while inside the current scope.

It's already safe for a function called from this scope to modify a global exit var; if a function doesn't get inlined, compilers generally have to assume that every global var could have been modified, same for everything possibly reachable from global pointers (escape analysis), or from calling functions in this translation unit that modify file-scoped static variables.

And like I said, you can use it for multi-threading, but don't. C11 _Atomic is standardized and can be used to write code that compiles to the same asm, but with more guarantees about exactly what is and isn't implied. (Especially ordering wrt. other operations.)
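For illustration, here is a hedged sketch of the C11 alternative for an exit flag. The names `exit_flag`, `request_exit`, and `exit_requested` are hypothetical, and `memory_order_relaxed` is chosen only because a bare flag with no associated data needs no ordering; code that publishes data alongside the flag would want release/acquire instead:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag name. Static storage duration means it starts out
   zero-initialized, i.e. false. */
static atomic_bool exit_flag;

/* Called by whichever thread wants the worker loop to stop. */
void request_exit(void) {
    atomic_store_explicit(&exit_flag, true, memory_order_relaxed);
}

/* Polled by the worker loop, e.g. while (!exit_requested()) { ... } */
bool exit_requested(void) {
    return atomic_load_explicit(&exit_flag, memory_order_relaxed);
}
```

Unlike a plain `volatile int`, concurrent access from two threads here is well-defined in the C abstract machine.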


They have no equivalent in hand-written asm because there's no optimizer between the asm source and the machine code.

In C compiler output, you won't notice a difference if you compile with optimization disabled. (Well maybe a minor difference in expressions that read the same volatile multiple times.)

Compiling with optimization disabled makes bad, uninteresting asm, where every object is treated much like volatile to enable consistent debugging. As "Multithreading program stuck in optimized mode but runs normally in -O0" shows, the optimizations allowed by making variables plain non-volatile only get done with optimization enabled. See also this Q&A about the same issue on single-core microcontrollers with interrupts.

*What would be an example of the assembly instructions it would emit for the most basic example using these two keywords?*

Try it yourself on https://godbolt.org/ with gcc10 -O3. You already have a useful test-case for restrict; it should let the compiler load *c once.
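A sketch of a side-by-side test case to paste into a compiler explorer, assuming two hypothetical names, `update_plain` and `update_restrict`. The only difference between them is the `restrict` qualifier on `c`:

```c
/* Without restrict: a or b might point at the same int as c, so after the
   store to *a the compiler has to assume *c may have changed and reload it. */
void update_plain(int *a, int *b, int *c) {
    *a += *c;
    *b += *c;
}

/* With restrict on c: the caller promises the object read through c is not
   modified through a or b, so one load of *c can serve both additions.
   Calling this with c aliasing a or b is undefined behaviour. */
void update_restrict(int *a, int *b, int *restrict c) {
    *a += *c;
    *b += *c;
}
```

The aliasing case shows why the reload matters: with `int x = 1, y = 2;`, calling `update_plain(&x, &y, &x)` doubles `x` to 2 first, and then adds the new value to `y`, giving 4. The same call to `update_restrict` would break the promise made to the optimizer.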

Or if you search at all, Ciro Santilli has already analyzed the exact function you're asking about back in 2015, in an answer with over 150 upvotes. I found it by searching on site:stackoverflow.com optimize restrict, as the 3rd hit.

Realistic usage of the C99 'restrict' keyword? shows your exact case, including asm output with/without restrict, and analysis / discussion of that asm.

Peter Cordes
  • Note I don't think volatile means what you think it means; in other questions/answers I/we have seen that the compiler chooses whether every read or write happens to RAM or stays in a register temporarily. There are long-winded debates online about the use of volatile to attempt to share a variable between, say, a handler and the foreground (argued by many to be inadvisable) or for, say, peripheral access to a control or status register (supposed to always work). YMMV. No, I can't recall exactly which questions demonstrated this. One was a gcc vs clang – old_timer Aug 29 '20 at 13:24
  • I almost never use volatile, but if I do, I check the disassembly to confirm the compiler did what I was hoping it would do: that every read or write accesses the RAM location. – old_timer Aug 29 '20 at 13:33
  • @old_timer: I know that the Linux kernel uses a cast from `T*` to `volatile T*` in its `WRITE_ONCE` and `READ_ONCE` macros as part of rolling its own atomics, so it works in practice for that in GCC and clang for ISAs that can run Linux. And I know that the ISO C standard says volatile accesses are "evaluated strictly according to the rules of the abstract machine" which is vague enough to leave room for interpretation, but that *this* question was pretty conceptual and doesn't need to get into the engineering details of exactly when you can really use it on some compilers. – Peter Cordes Aug 29 '20 at 13:38
  • @old_timer: Also, this is tagged x86, and everything I've said is true for the few mainstream x86 compilers. (Especially: don't use volatile unless you truly need it; `_Atomic` is usually better for everything except MMIO.) – Peter Cordes Aug 29 '20 at 13:39
  • llvm/clang is a mainstream compiler. – old_timer Aug 29 '20 at 13:40
  • just saying be careful with words like every, always, never, etc...Esp with languages like C which have so many "implementation defined" and other issues. – old_timer Aug 29 '20 at 13:41
  • @old_timer: Have you seen clang/llvm for x86 optimize away volatile accesses? clang can compile the Linux kernel, I thought, and I'm pretty sure `volatile` works as expected on clang in *most* cases. But ok, fair enough, I see your point about saying "always" if there are exceptions. – Peter Cordes Aug 29 '20 at 13:43
  • "_The as-if rule already requires compilers to make asm that gives observable results equivalent to stepping_" exactly, which is why MT is not defined, because stepping is not defined in MT code! C/C++ need something like the (too) complicated Java semantics. (Even Java never managed to get the result they wanted!) – curiousguy Sep 08 '20 at 02:26
  • @curiousguy: Correct; that's why I linked [When to use volatile with multi threading?](https://stackoverflow.com/a/58535118) as part of my answer. You need `_Atomic` or `_Atomic volatile` for well-defined MT behaviour in the C abstract machine, `volatile` doesn't help at all in ISO C, only as a hack in practice on real hardware. – Peter Cordes Sep 08 '20 at 05:04