I'm going to focus on practical differences in kinds of UB, not just how the ISO standard rates them.
Related: What Every C Programmer Should Know About Undefined Behavior (http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html) has some good stuff about how data-dependent UB allows compilers to assume things won't happen, optimizing the asm. (The ISO C standard has nothing to say about what should happen for paths of execution that do encounter UB, so whatever ends up actually happening in such cases is fine.) e.g. that's why for(int i = 0 ; i<=n ; i++) can be assumed to be non-infinite because signed int overflow isn't defined the way unsigned is. So n = INT_MAX would lead to UB. But the equivalent with unsigned is potentially infinite for n=UINT_MAX.
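As a rough sketch of that loop in both flavours (the function names count_signed and count_unsigned are made up for illustration, not from the linked article):

```cpp
// Hypothetical helper: counts iterations of the signed loop.
// With n == INT_MAX, i++ would overflow (UB), so the compiler is
// allowed to assume the loop always terminates.
long long count_signed(int n) {
    long long iterations = 0;
    for (int i = 0; i <= n; i++)
        iterations++;
    return iterations;
}

// Same loop with unsigned: i wrapping back to 0 is well-defined,
// so with n == UINT_MAX the condition i <= n never becomes false
// and the compiler can't just assume a trip count of n+1.
long long count_unsigned(unsigned n) {
    long long iterations = 0;
    for (unsigned i = 0; i <= n; i++)
        iterations++;
    return iterations;
}
```

For small inputs both behave identically, e.g. count_signed(3) and count_unsigned(3) both return 4. The difference is only in what the optimizer may assume about the extreme input.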
You're talking about compile-time-visible UB, though, which you could say is "more undefined" because compilers can notice it and do things on purpose. (For example, emit an illegal instruction to make the program fault if it gets to this point.)
Contrast this with cases where the compiler just optimizes according to whatever guarantees / assumptions it's allowed to make: code where it doesn't notice the UB at compile time, or where the UB only happens for some possible function args, so the compiler still has to make asm that works for the inputs that don't lead to UB in the abstract machine.
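A small sketch of that second kind, where UB is only possible for one input (next_is_larger is a hypothetical name chosen for this example):

```cpp
// Hypothetical example: x + 1 overflows only when x == INT_MAX,
// and that overflow is UB. For every other x the comparison is true,
// so an optimizing compiler is allowed to compile this to "return true"
// without ever "noticing" any UB.
bool next_is_larger(int x) {
    return x + 1 > x;
}
```

The asm still has to give the right answer for every input that doesn't overflow, and it does: next_is_larger(5) is true. Only the INT_MAX case is left to whatever happens to fall out of the optimization.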
Some interesting examples of compile-time-visible UB:
For example, in C++ (unlike C; see Footnote 1) it's illegal for execution to fall off the end of a non-void function. Modern GCC and clang optimize accordingly, assuming such paths of execution are never reached and emitting no instructions at all for them, not even a ret.
Let's make some simple examples on Godbolt compiling for x86-64:
int x, y; // global vars; compiler has to assume assigning to these is a visible side-effect that code in other compilation units could see.
int bad_int_func() {
    x = 0;  // gcc still stores but no ret
    y = 0;  // clang backtracks and emits no instructions for the whole block
    // return 0;
}
compiles like this with GCC11.2 -O2:
bad_int_func():
    mov DWORD PTR x[rip], 0
    mov DWORD PTR y[rip], 0
    # missing ret, there'd be one if the function was void
Clang is even more aggressive: we just get the label and no instructions. It didn't emit code for any of the earlier C++ statements in this basic block (sequence of code with no branches or branch targets) that ends by falling off the end.
And yes, both compilers warn about that, e.g. gcc warns twice (even without -Wall or -Wextra): warning: no return statement in function returning non-void [-Wreturn-type] and warning: control reaches end of non-void function [-Wreturn-type].
Where this gets more interesting is in a function with some branches so it's possible for it to be called safely, as part of a well-behaved program. (It would be poor style to write a function exactly like this, but with more complex things, maybe a switch or if(foo) ... return / if (bar) ... return / ..., there can be some paths that the compiler can't prove are never taken in the first place.)
int foo(int a){
    y = 0;
    if (a == 0) {
        y = 1;
        return 0;
    }
    // x = 2;     // if uncommented, GCC does branch. clang doesn't care
    // return a;  // entirely changes the function vs. no return
}
The only legal path of execution through this function is with a==0, so GCC and clang simply assume that's the case, optimizing away the branch:
foo(int): # @foo(int)
    mov dword ptr [rip + y], 1
    xor eax, eax
    ret
Of course, this compiles very differently if you compile as C so it's legal to fall off the end without returning a value:
# clang13 -O2 -xc
foo:
    xor eax, eax
    test edi, edi
    sete al                       # tmp = (a == 0)
    mov dword ptr [rip + y], eax
    xor eax, eax                  # unconditionally return 0
    ret
Since the fall-through path doesn't have a return statement, it doesn't matter what's in the return-value register at that point. Setting it to 0 is what we need for the if() body's return statement, and just unconditionally doing that is much cheaper than compare-and-branching to see if we should zero it or not. (GCC doesn't spot that, and does branch over a y=1 store and an xor eax,eax. It unconditionally did the y=0 store first, even though it did tail duplication with two separate ret instructions :/)
Of course, if this were inlined into a caller that did use the return value, the same optimizations as with C++ would apply, just as if you had put a __builtin_unreachable() there instead of a return.
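A minimal sketch of that, assuming GCC or clang (__builtin_unreachable() is a compiler extension, not ISO C or C++). The names foo_unreachable, Shape and corners are invented for this example:

```cpp
int y;  // global, mirroring the y used in the examples above

// Variant of foo that spells out the promise explicitly:
// execution never gets past the if. Calling this with a != 0 is UB,
// the same as falling off the end of a non-void function in C++.
int foo_unreachable(int a) {
    y = 0;
    if (a == 0) {
        y = 1;
        return 0;
    }
    __builtin_unreachable();  // GCC/clang extension
}

// The "more complex" case: every enumerator returns, but the compiler
// can't prove no other value is ever passed in.
enum Shape { Circle, Square, Triangle };  // hypothetical enum

int corners(Shape s) {
    switch (s) {
        case Circle:   return 0;
        case Square:   return 4;
        case Triangle: return 3;
    }
    __builtin_unreachable();  // promise: s is always one of the above
}
```

This makes the assumption visible in the source and should silence the -Wreturn-type warning, but it doesn't make bad inputs any safer: passing a value outside the handled cases is still UB. C++23 also has std::unreachable() in <utility> for the same purpose.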
Footnote 1: In C, it's only UB for the caller to use the return value of such a function, for historical reasons: before void was introduced to the language, every function had a return value, often implicit int. But people didn't bother to put a return 0; at the bottom if they didn't want a return value. (Not just "didn't bother"; saving those bytes of compiler-generated machine code was probably valuable on tiny old machines.)