I have already searched Google and Stack Overflow, and I am aware that compilers cannot assume that a function won't modify an argument passed by const reference, because the function might obtain a non-const reference via const_cast. However, doing so is undefined behavior when the original object itself is defined as const. From cppreference:
Modifying a const object through a non-const access path and referring to a volatile object through a non-volatile glvalue results in undefined behavior.
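To make the distinction concrete, here is a minimal sketch of the well-defined case. The names bump, legal_case, and check_legal_case are hypothetical and do not appear in the code in question; the undefined case is described only in a comment and is deliberately not executed.

```cpp
#include <cassert>

// Hypothetical callee: casts away constness and writes through the reference.
void bump(const int& r) {
    const_cast<int&>(r) += 1;
}

// Well-defined: x is NOT defined as const, so writing to it through the
// const_cast reference is allowed. The caller therefore has to assume that
// x may have changed after the call.
int legal_case() {
    int x = 3;
    bump(x);
    return x;
}

// Not shown as running code: if x were defined as `const int x = 3;`, the
// write inside bump would be undefined behavior, which is what lets a
// compiler assume the value is still 3 after the call.

// Small usage check for the well-defined case.
void check_legal_case() {
    assert(legal_case() == 4);
}
```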
Consider the following code:
void fun(const int&);

int f1() {
    const int i = 3;
    fun(i);
    return i;
}

static int bar(const int i) {
    fun(i);
    return i;
}

int f2() {
    return bar(3);
}
Both GCC and Clang are able to optimize f1() to return 3 directly: the compilers assume that calling fun(i) won't modify the value of i, since such a modification would be undefined behavior. However, neither GCC nor Clang applies the same optimization to f2(); both still generate code to load the value of i from memory. Below is the code GCC generates for f1() and f2() (Compiler Explorer):
f1():
subq $24, %rsp
leaq 12(%rsp), %rdi
movl $3, 12(%rsp)
call fun(int const&)
movl $3, %eax ! <-- Returns 3 directly.
addq $24, %rsp
ret
f2():
subq $24, %rsp
leaq 12(%rsp), %rdi
movl $3, 12(%rsp)
call fun(int const&)
movl 12(%rsp), %eax ! <-- Load the return value from memory.
addq $24, %rsp
ret
Even though the standard does not require compilers to perform such optimizations, I believe they should be able to optimize f2() to return 3 directly as well. In my view, this would result in more efficient code (please correct me if I'm mistaken). Once the compiler inlines the call bar(3) into f2(), it should be able to deduce that calling fun(i) will not modify the value of i, since the parameter i is itself declared const.
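As a point of comparison, here is a minimal sketch of a manual workaround: take a value copy before the call and return the copy, so the result no longer depends on memory that fun could reach. The names bar_snapshot, f2_snapshot, and check_snapshot are hypothetical, and fun is given a trivial body here only so the example links. In the question fun is merely declared, so the compiler cannot see its body. Whether a particular compiler then emits a constant return is an expectation, not something verified here.

```cpp
#include <cassert>

// Assumption: a stand-in definition so the example is complete. In the
// question, fun is only declared and its body is unknown to the compiler.
void fun(const int& r) {
    (void)r;  // intentionally does nothing
}

static int bar_snapshot(const int i) {
    const int result = i;  // value copy; its address is never passed to fun
    fun(i);
    return result;         // independent of anything fun might do to i
}

int f2_snapshot() {
    return bar_snapshot(3);
}

// Small usage check.
void check_snapshot() {
    assert(f2_snapshot() == 3);
}
```

Because result never has its address taken, constant propagation after inlining does not need to rely on the undefined-behavior argument at all.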
Here is another example. When I replace the variable i in f1() with a class type, Clang can still optimize the function to return 3. However, GCC loads the return value from memory instead:
struct A {
    int i;
};

void fun(const A&);

int f3() {
    const A a{3};
    fun(a);
    return a.i;
}
Here is the code generated by GCC:
f3():
subq $24, %rsp
leaq 12(%rsp), %rdi
movl $3, 12(%rsp)
call fun(A const&)
movl 12(%rsp), %eax ! <-- Load the return value from memory.
addq $24, %rsp
ret
and Clang:
f3():
pushq %rax
movl $3, (%rsp)
movq %rsp, %rdi
callq fun(A const&)@PLT
movl $3, %eax ! <-- Returns 3 directly.
popq %rcx
retq
Why doesn't GCC optimize the function to return 3 directly? Is it because GCC considers loading the return value from memory to be as efficient as returning a constant?