Consider the following code:
#include <string_view>
constexpr std::string_view f() { return "hello"; }
static constexpr std::string_view g() {
    auto x = f();
    return x.substr(1, 3);
}
int foo() { return g().length(); }
If I compile it with GCC 10.2 and the flags --std=c++17 -O1, I get:
foo():
mov eax, 3
ret
Also, to my knowledge, this code does not have any undefined behavior.
However, if I add the flag -fsanitize=undefined, the compilation result is:
.LC0:
.string "hello"
foo():
sub rsp, 104
mov QWORD PTR [rsp+80], 5
mov QWORD PTR [rsp+16], 5
mov QWORD PTR [rsp+24], OFFSET FLAT:.LC0
mov QWORD PTR [rsp+8], 3
mov QWORD PTR [rsp+72], 4
mov eax, OFFSET FLAT:.LC0
cmp rax, -1
jnb .L4
.L2:
mov eax, 3
add rsp, 104
ret
.L4:
mov edx, OFFSET FLAT:.LC0+1
mov rsi, rax
mov edi, OFFSET FLAT:.Lubsan_data154
call __ubsan_handle_pointer_overflow
jmp .L2
.LC1:
.string "/opt/compiler-explorer/gcc-10.2.0/include/c++/10.2.0/string_view"
.Lubsan_data154:
.quad .LC1
.long 287
.long 49
See this on Compiler Explorer.
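As far as I can tell from the assembly above, the instrumented path is a pointer-overflow check: `cmp rax, -1` followed by `jnb .L4` calls the handler only if the address of "hello" is so high that adding the offset 1 (the first argument of substr) would wrap around. Below is a minimal sketch of that check on plain integers. The helper name `pointer_add_overflows` is made up for illustration and is not part of UBSan or libstdc++.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): models the wrap-around test that the
// instrumented code seems to perform on "base pointer + offset".
constexpr bool pointer_add_overflows(std::uintptr_t base, std::uintptr_t offset) {
    // Unsigned arithmetic wraps modulo 2^N, so a wrapped sum is smaller than base.
    return base + offset < base;
}

// An ordinary address plus 1 does not wrap.
static_assert(!pointer_add_overflows(0x1000, 1));
// Only the highest representable address wraps when 1 is added, which matches
// the "address >= -1 (unsigned)" comparison in the assembly.
static_assert(pointer_add_overflows(UINTPTR_MAX, 1));
```

Since the address of a string literal can never be that high in practice, the check looks like it should always pass, which is part of why the leftover stores and the branch surprise me.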
My question: Why should the sanitization interfere with the optimization? Especially since the code doesn't seem to have any UB hazards...
Notes:
- I suspect a GCC bug, but maybe I have the wrong idea of what UBSan does.
- Same behavior if I set -O3.
- With no optimization flags, the longer code is produced both with and without sanitization.
- If you declare x to be a constexpr variable, the sanitization doesn't prevent the optimization.
- Same behavior with C++17 and C++20.
- With Clang, you get this discrepancy as well, but only with a higher optimization setting (e.g. -O3).
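For completeness, here is a sketch of the variant from the notes in which x is a constexpr variable. The static_assert is my own addition. It forces the whole call chain to be evaluated at compile time, and a conforming compiler has to reject a constant expression that runs into core-language undefined behavior, so the fact that it compiles supports (but does not strictly prove, since library preconditions need not be diagnosed) the claim that the code has no UB.

```cpp
#include <string_view>

constexpr std::string_view f() { return "hello"; }

static constexpr std::string_view g() {
    // Variant from the notes: x must itself be a constant expression.
    constexpr auto x = f();
    return x.substr(1, 3);  // the view "ell"
}

// Compile-time evaluation of g(); core-language UB here would be a hard error.
static_assert(g().length() == 3);

int foo() { return g().length(); }
```

The same static_assert also compiles against the original g(), where x is a plain auto variable.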