I am trying to understand how the volatile keyword works in C++.
I had a look at "What kinds of optimizations does 'volatile' prevent in C++?". Going by the accepted answer, volatile prevents two kinds of optimizations:
- Caching the value in a register.
- Optimizing away accesses to that value when they seem unnecessary from the point of view of your program.
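To check my understanding of those two points, here is a minimal sketch of what I think they mean. The function names are hypothetical and made up for this illustration; they do not come from the linked answer.

```cpp
// Hypothetical helper names, invented for illustration only.

// Plain int: the compiler is allowed to load *p once, keep it in a
// register, and reuse that value for the second read.
int read_twice_plain(const int* p)
{
    int first = *p;
    int second = *p;
    return first + second;
}

// volatile int: each evaluation of *p is an observable access, so two
// separate loads from memory have to be emitted.
int read_twice_volatile(const volatile int* p)
{
    int first = *p;
    int second = *p;
    return first + second;
}

// Plain int: the first store is dead and may be removed entirely.
void write_twice_plain(int* p)
{
    *p = 1;
    *p = 2;
}

// volatile int: both stores have to be emitted, in this order.
void write_twice_volatile(volatile int* p)
{
    *p = 1;
    *p = 2;
}
```

In a single-threaded program the plain and volatile versions return the same results; as far as I can tell, the difference only shows up in the number of loads and stores in the generated code.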
I found similar information at The as-if rule:
Accesses (reads and writes) to volatile objects occur strictly according to the semantics of the expressions in which they occur. In particular, they are not reordered with respect to other volatile accesses on the same thread.
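My reading of that quote, as a rough sketch: only the volatile accesses are ordered with respect to each other, while plain accesses around them carry no such guarantee. The `Mailbox` struct and `send` function below are hypothetical and exist only to illustrate this; this is about what the compiler may emit, not about what the hardware does afterwards.

```cpp
// Hypothetical device-style structure, invented for this illustration.
struct Mailbox {
    volatile int command;   // volatile: every access is observable
    volatile int doorbell;  // volatile: every access is observable
    int scratch;            // plain: accesses may be moved or merged
};

void send(Mailbox& box, int value)
{
    box.scratch = value;  // plain store: may be moved relative to the volatile stores
    box.command = value;  // volatile store #1
    box.doorbell = 1;     // volatile store #2: must not be emitted before store #1
}
```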
I wrote a simple C++ program that sums all the values in an array to compare the behaviour of plain ints vs. volatile ints. Note that the partial sums are not volatile.
The array consists of unqualified ints:
#include <array>

int foo(const std::array<int, 4>& input)
{
    auto sum = 0xD;  // accumulator is a plain (non-volatile) int
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}
The array consists of volatile ints:
int bar(const std::array<volatile int, 4>& input)
{
    auto sum = 0xD;  // accumulator is a plain (non-volatile) int
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}
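For completeness, here is a small self-contained sketch that repeats both functions and adds two helpers so the results can be compared. The helpers `sum_plain` and `sum_volatile` and the input values 1, 2, 3, 4 are assumptions added only for this check; they are not part of the code whose assembly is shown in this question.

```cpp
#include <array>

int foo(const std::array<int, 4>& input)
{
    auto sum = 0xD;
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}

int bar(const std::array<volatile int, 4>& input)
{
    auto sum = 0xD;
    for (auto element : input)
    {
        sum += element;
    }
    return sum;
}

// Hypothetical helper: sums a plain array through foo.
int sum_plain()
{
    std::array<int, 4> values{1, 2, 3, 4};
    return foo(values);
}

// Hypothetical helper: sums a volatile array through bar.
int sum_volatile()
{
    std::array<volatile int, 4> values{1, 2, 3, 4};
    return bar(values);
}
```

Both helpers return 23 (0xD plus 1 + 2 + 3 + 4), so the two versions differ only in the code that is generated for them, not in the value they compute.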
When I look at the generated assembly, SSE registers are used only in the case of plain ints. From what little I understand, the code using SSE registers is neither optimizing away the reads nor reordering them with respect to each other. The loop is unrolled, so there aren't any branches either. The only explanation I can think of for the difference in code generation is this: can the volatile reads be reordered before the accumulation happens? Clearly, sum is not volatile. If such reordering is bad, is there a situation/example that can illustrate the issue?
Code generated using Clang 9:
foo(std::array<int, 4ul> const&): # @foo(std::array<int, 4ul> const&)
    movdqu (%rdi), %xmm0
    pshufd $78, %xmm0, %xmm1 # xmm1 = xmm0[2,3,0,1]
    paddd %xmm0, %xmm1
    pshufd $229, %xmm1, %xmm0 # xmm0 = xmm1[1,1,2,3]
    paddd %xmm1, %xmm0
    movd %xmm0, %eax
    addl $13, %eax
    retq
bar(std::array<int volatile, 4ul> const&): # @bar(std::array<int volatile, 4ul> const&)
    movl (%rdi), %eax
    addl 4(%rdi), %eax
    addl 8(%rdi), %eax
    movl 12(%rdi), %ecx
    leal (%rcx,%rax), %eax
    addl $13, %eax
    retq