When working with multithreading, explicitness is good, so I'm going to break down each and every piece.
"Will dereferencing pointers always cause memory access?"
No. Consider the expression statement (void)*p; in which *p performs indirection. From [expr.unary.op]:
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an
object type, or a pointer to a function type and the result is an lvalue referring to the object or function
to which the expression points.
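So the result is an lvalue (not an "lvalue reference", and not yet a value). That, on its own, is not sufficient to cause a "read" of the data pointed to by p; a read only happens when the lvalue is converted to a value (the lvalue-to-rvalue conversion). In the above example, I explicitly throw away the result, so, assuming p does not point to a volatile object, there's no reason to read the memory.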
Of course, one might argue that the memory of p itself is read. Just to be pedantic, I'd point out that's one interpretation of the word. However, an optimizing compiler can see that the lvalue *p is not needed here, so it doesn't actually need to read the pointer at all.
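To make the distinction concrete, here is a minimal sketch of the three cases. The function names are hypothetical and exist only for illustration; whether a load instruction is actually emitted depends on the compiler and optimization level.

```cpp
// Hypothetical helpers illustrating when indirection implies a read.

void discard_plain(int* p) {
    // Discarded-value expression on a non-volatile lvalue:
    // no lvalue-to-rvalue conversion is applied, so no read is required.
    (void)*p;
}

void discard_volatile(volatile int* p) {
    // Discarded-value expression on a volatile-qualified lvalue:
    // the lvalue-to-rvalue conversion is applied, so this counts as a read.
    (void)*p;
}

int read_value(int* p) {
    // The value is used, so the lvalue-to-rvalue conversion reads the object.
    return *p;
}
```

Calling all three with the address of a valid int is well defined; only read_value and discard_volatile are required to access the pointed-to object.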
Now what about in a multithreaded environment? The key to this is the "happens before" relationship in [intro.multithread]. It's incredibly dry formal language, but the basic idea is that event A happens before event B if A is sequenced before B (in a single thread), or if A inter-thread happens before B. The latter is the fancy language-lawyer term used to capture the behavior of synchronization primitives such as mutexes and atomics.
If A does not happen before B and B does not happen before A, then the two events are not ordered with respect to each other. This is what happens on two threads when you don't have anything like mutexes to force an ordering. If one event writes to a memory location, the other reads or writes that same location, and at least one of the two accesses is not atomic, the result is a data race. And a data race is undefined behavior: you get what you get. The spec has nothing to say about what happens when that occurs. It doesn't say whether it triggers a memory access or not; it says absolutely nothing about it.
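As a rough illustration of how a synchronization primitive creates the ordering, the sketch below publishes a plain int through an atomic flag. The function and variable names (publish_and_consume, payload, ready) are made up for this example. The release store and the acquire load that observes it give the write to payload a happens-before edge to the later read, so the plain accesses do not race.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example: publish a non-atomic value via an atomic flag.
int publish_and_consume() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // synchronization flag
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // A: plain write
        ready.store(true, std::memory_order_release);  // release store
    });

    std::thread consumer([&] {
        // Spin until the acquire load observes the release store.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // B: plain read, ordered after A
    });

    producer.join();
    consumer.join();  // join also synchronizes, so reading seen here is safe
    return seen;
}
```

If ready were a plain bool instead, the accesses to both ready and payload would be unordered across the two threads, which is exactly the data race described above.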
As an effect of the rules codified in [intro.multithread], the compiler is effectively allowed to optimize its code as if a thread were operating in complete isolation unless a threading primitive (such as a mutex or atomic) forces otherwise. This includes all the usual elisions, such as not reading from memory if you don't have to.
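A small sketch of what that elision can look like follows. The function names are hypothetical, and the comments describe what a compiler is permitted to do, not what any particular compiler is guaranteed to do.

```cpp
#include <atomic>

// Hypothetical helpers contrasting plain and atomic reads.

int twice_plain(const int* p) {
    // Two indirections in the source. In a race-free program nothing can
    // change *p between them, so the compiler may load once and reuse it.
    return *p + *p;
}

int twice_atomic(const std::atomic<int>* p) {
    // Each load is a separate atomic operation that may observe a value
    // written by another thread, so the two results can legitimately differ.
    int first = p->load(std::memory_order_acquire);
    int second = p->load(std::memory_order_acquire);
    return first + second;
}
```

In a single-threaded call both functions return the same result; the difference only matters once another thread may be writing to the object.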