This paper contains the following example of a piece of code that can trigger a division by zero:
    if (arg2 == 0)
        ereport(ERROR, (errcode(ERRCODE_DIVISION_BY_ZERO),
                        errmsg("division by zero")));
    /* No overflow is possible */
    PG_RETURN_INT32((int32) arg1 / arg2);
ereport here is a macro that expands to a call to a bool-returning function errstart that may or may not return and, conditional (using a ?:) on its return value, a call to another function. In this case, I believe ereport with level ERROR unconditionally causes a longjmp() someplace else.
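To make the macro shape described above concrete, here is a minimal sketch of what I understand it to look like. It is not PostgreSQL's real code: every name prefixed with sketch_, the SKETCH_ERROR level, and the checked_div wrapper are hypothetical stand-ins, and the longjmp() target is set up locally only so that the example is self-contained.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins; these are NOT PostgreSQL's real definitions. */
#define SKETCH_ERROR 20

static jmp_buf sketch_handler;  /* where an ERROR-level report jumps to */

/* Returns true when the report must be processed further. */
static bool sketch_errstart(int level)
{
    return level >= SKETCH_ERROR;
}

/* Assumed behaviour for ERROR level: print, then never return. */
static void sketch_errfinish(const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    longjmp(sketch_handler, 1);
}

/* The shape described above: a call to errstart and, conditional
 * (using ?:) on its return value, a call to another function. */
#define sketch_ereport(level, msg) \
    (sketch_errstart(level) ? sketch_errfinish(msg) : (void) 0)

/* Returns 0 and stores the quotient in *out, or returns -1 if an
 * error was reported and control came back through longjmp(). */
int checked_div(int arg1, int arg2, int *out)
{
    if (setjmp(sketch_handler) != 0)
        return -1;
    if (arg2 == 0)
        sketch_ereport(SKETCH_ERROR, "division by zero");
    *out = arg1 / arg2;
    return 0;
}
```

Under the naive reading, checked_div(1, 0, &q) reports the error and returns -1 without ever evaluating the division, while checked_div(6, 3, &q) stores 2 in q and returns 0.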
Consequently, a naive interpretation of the above code is that, if arg2 is nonzero, the division will happen and the result will be returned, while, if arg2 is zero, an error will be reported and the division will not happen. However, the linked paper claims that a C compiler may legitimately hoist the division before the zero check, then infer that the zero check is never triggered. Their only reasoning, which seems incorrect to me, is that
[T]he programmer failed to inform the compiler that the call to ereport(ERROR, ...) does not return. This implies that the division will always execute.
John Regehr has a simpler example:
    void bar (void);
    int a;
    void foo3 (unsigned y, unsigned z)
    {
      bar();
      a = y%z;
    }
According to this blog post, clang hoists the modulo operation above the call to bar, and he shows some assembly code to prove it.
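To show why the order matters, here is a small sketch that completes the snippet with one possible definition of bar that does not always return. The names escape, bar_returns, and run_foo3 are my own hypothetical additions; in the blog post, as I read it, only the declaration of bar is visible to the compiler.

```c
#include <setjmp.h>

static jmp_buf escape;  /* hypothetical: target for the non-returning path */
int bar_returns = 1;    /* hypothetical switch: nonzero means bar returns */
int a;

/* One possible definition of bar: it either returns normally or
 * leaves through longjmp(). */
void bar(void)
{
    if (!bar_returns)
        longjmp(escape, 1);
}

/* Unchanged from the snippet above. */
void foo3(unsigned y, unsigned z)
{
    bar();
    a = y % z;
}

/* Hypothetical driver: returns 0 if foo3 ran to completion, or 1 if
 * bar escaped before the modulo was reached. */
int run_foo3(unsigned y, unsigned z)
{
    if (setjmp(escape) != 0)
        return 1;
    foo3(y, z);
    return 0;
}
```

With bar_returns set to 0, the call run_foo3(1, 0) never reaches the modulo in the abstract machine, so it should return 1 and leave a untouched. If the modulo were evaluated before the call to bar, the same call would divide by zero.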
My understanding of C as it applies to these snippets was that:
1. Functions that do not, or may not, return are well-formed in standard C, and declarations of such require no particular attributes, bells, or whistles.
2. The semantics of a call to a function that does not, or may not, return are well-defined, in particular by 6.5.2.2 "Function calls" in C99.
3. Since the ereport invocation is a full expression, there is a sequence point at the ;. Similarly, since the bar call in John Regehr's code is a full expression, there is a sequence point at the ;.
4. There is consequently a sequence point between the ereport invocation or bar call and the division or modulo.
5. C compilers may not introduce undefined behaviour to programs that do not elicit undefined behaviour on their own.
These five points seem to be enough to conclude that the above division-by-zero test is correctly written and that hoisting the modulo above the call to bar is incorrect. Two compilers and a host of experts disagree. What is wrong with my reasoning?