43

My toy compiler crashes if I divide by zero in a constant expression:

int x = 1 / 0;

Is this behaviour allowed by the C and/or C++ standards?

fredoverflow
  • 8
    So `x` is not a constexpr variable, just making sure but otherwise the C++ standard says `If the second operand of / or % is zero the behavior is undefined` – Shafik Yaghmour Nov 25 '15 at 12:08
  • Off-topic: How does it crash, by the way? Does it dereference a null pointer or something? – nurettin Nov 25 '15 at 12:13
  • @nurettin Well, it "crashes" in the sense that the user gets a popup message "Throwable: int overflow infinity" without giving any hint (line number or something) where the problem lies. It doesn't actually segfault or something :) – fredoverflow Nov 25 '15 at 12:16
  • 6
    I believe you cannot write a toy C++ compiler. Any C++11 conforming compiler won't be a toy; but you could write in C++ a toy compiler for your toy language! Then you are defining your language standard, and the C++ standards do not matter! – Basile Starynkevitch Nov 25 '15 at 13:08
  • 1
    One variation that I would like the answerers to opine on is whether `if (0) { int x = 1 / 0; }` is allowed to crash the compiler or produce a program that crashes. – Pascal Cuoq Nov 25 '15 at 13:36
  • 2
    @PascalCuoq: http://stackoverflow.com/a/18385138/2003898 -> nope, it isn't allowed to crash, it is well defined. – dhein Nov 25 '15 at 14:03
  • 4
    afaik the standard does not say anything about compilers crashing, but about compiled programs crashing. I really don't think it is disallowed for a compiler to crash when it encounters e.g. `int x = 1*0;`, whether such a compiler is of any use is a different question :P – 463035818_is_not_an_ai Nov 25 '15 at 14:31
  • @nurettin: It crashes because integer division by zero raises an exception in the CPU and the kernel handles this exception by killing the application that raised it. Presumably the compiler is performing constant folding and is evaluating the division by zero, causing the crash. – Jordan Melo Nov 25 '15 at 14:51
  • 1
    The implementation is not required to issue a diagnostic message when it detects undefined behavior, but the standard does permit it, and does not specify what form a diagnostic message must take. I think it's perfectly reasonable to consider a compiler crash to be a particularly emphatic form of diagnostic message. :-) – Ray Nov 25 '15 at 22:03
  • @Zaibis see Pascal's answer to the question you linked to! (I disagree with his POV but he does explain his position well) – M.M Nov 25 '15 at 23:41
  • By definition, you *cannot* divide by zero in a constant expression. For C, [N1570](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) 6.6p4: "Each constant expression shall evaluate to a constant that is in the range of representable values for its type." C++ has a similar rule. So `1 / 0` is an *expression*, but it's not a *constant expression* (even though its subexpressions are constant expressions). If you use `1 / 0` in a context that requires a constant expression, a conforming compiler must issue a diagnostic message. (Is a compile-time crash a diagnostic message?) – Keith Thompson Nov 26 '15 at 01:31
  • @KeithThompson well in this example a constant expression is not required; I think what the OP meant was with respect to constant folding. I think the *constant expression* in the title is a red herring, though both operands are *integer constants* in C terminology and integer literals in C++. This usually allows constant folding opportunities. – Shafik Yaghmour Nov 26 '15 at 02:37
  • @KeithThompson there are lots of other questions, such as whether undefined behavior is really meant to be a compile-time concept. I have seen people claim it is purely a run-time concept but I can find no wording in either standard that says that. – Shafik Yaghmour Nov 26 '15 at 02:40
  • @KeithThompson note that C11 also says *An implementation may accept other forms of constant expressions*. – Shafik Yaghmour Nov 26 '15 at 02:43
  • Hmm I didn't think of undefined behavior affecting compilers.. only the runtime behavior. Interesting. – Neil Kirk Nov 26 '15 at 02:58
  • @NeilKirk: http://stackoverflow.com/q/18385020/827263 – Keith Thompson Nov 26 '15 at 02:59
  • @KeithThompson I had forgotten about that question. Interesting to note that C++ does not define *behavior* on which your answer there partially depends on. C++ did not even [define indeterminate value until C++14](http://stackoverflow.com/a/23415662/1708801). – Shafik Yaghmour Nov 26 '15 at 03:46
  • The code snippet on its own is obviously not a valid program and no standard applies. Since the context (whether the division is attempted at runtime) decides whether this is undefined behaviour or not, you should provide some context. In general, crashing is not allowed by either C or C++ standards. – Remember Monica Nov 26 '15 at 09:56
  • 1
    You should clarify the constant expression portion of your question, as it has been made clear `1 / 0` is not a constant expression in either C or C++. Several people are hung up on this and it does not really change the answer. – Shafik Yaghmour Nov 26 '15 at 16:03
  • See my updated answer, for C we have defect report 109 which clarifies that unless we can prove the UB will be executed then we must successfully translate the program. Hat tip to Pascal for prodding me in that direction. – Shafik Yaghmour Dec 06 '15 at 03:20
  • Note the response to the UB mailing list from Richard Smith [here](http://www.open-std.org/pipermail/ub/2014-September/000515.html) which says *Also, ill-formed, NDR implies that all executions of the program have undefined behavior (if the compiler accepts it, which it's permitted to), even if they don't actually execute the UB* which means C++ diverges from C on this point wrt to DR 109. Not sure if this is spelled out in the normative text though. – Shafik Yaghmour Dec 07 '15 at 20:55

5 Answers

40

Yes, division by zero is undefined behavior and neither the C nor the C++ standard imposes any requirements in such cases. In this case, though, I believe you should at least issue a diagnostic (see below).

Before quoting the standards, I should note that although this may be conforming behavior, quality of implementation is a different issue: being merely conforming is not the same as being useful. As far as I know, the gcc, clang, Visual Studio and Intel (as per tpg2114) teams consider internal compiler errors (ICEs) to be bugs that should be reported. Both current gcc and clang produce a warning for this case, seemingly regardless of the flags provided. When both operands are literals/constants, as they are here, it seems rather straightforward to detect this and provide a diagnostic. clang produces the following diagnostic for this case (see it live):

warning: division by zero is undefined [-Wdivision-by-zero]
int x = 1 / 0 ;
          ^ ~

From the draft C11 standard section 6.5.5 Multiplicative operators (emphasis mine):

The result of the / operator is the quotient from the division of the first operand by the second; [...] if the value of the second operand is zero, the behavior is undefined.

and so it is undefined behavior.

The draft C++ standard section 5.6 [expr.mul] says:

The binary / operator yields the quotient [...] If the second operand of / or % is zero the behavior is undefined [...]

again undefined behavior.

The draft C++ and draft C standards have similar definitions of undefined behavior, both saying:

[...]for which this International Standard imposes no requirements

The phrase imposes no requirements seems to allow any behavior, including nasal demons. Both have a similar note saying something along the lines of:

Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

So although notes are not normative, it seems like if you are going to terminate during translation, you should at least issue a diagnostic. The term terminating is not defined, so it is hard to argue what this allows. I don't think I have seen a case where clang and gcc have an ICE without a diagnostic.

Does the code have to be executed?

If we read Can code that will never be executed invoke undefined behavior? we can see that, at least in the case of C, there is room for debate about whether the 1 / 0 has to be executed in order to invoke undefined behavior. What is worse, in the C++ case the definition of behavior is not present, so part of the analysis used for the C case cannot be used for the C++ case.

It seems that if the compiler can prove the code will never be executed, then we can reason that it would be as-if the program did not have undefined behavior, but I don't think this is provable, just reasonable behavior.

From the C perspective WG14 defect report 109 further clarifies this. The following code example is given:

int foo()
{
  int i;
  i = (p1 > p2); /* Must this be "successfully translated"? */
  1/0; /* Must this be "successfully translated"? */
  return 0;
} 

and the response included:

Furthermore, if every possible execution of a given program would result in undefined behavior, the given program is not strictly conforming.
A conforming implementation must not fail to translate a strictly conforming program simply because some possible execution of that program would result in undefined behavior. Because foo might never be called, the example given must be successfully translated by a conforming implementation.

So in the case of C, unless it can be guaranteed that the code invoking undefined behavior will be executed, the compiler must successfully translate the program.

C++ constexpr case

If x was a constexpr variable:

constexpr int x = 1 / 0 ;

it would be ill-formed; gcc produces a warning and clang makes it an error (see it live):

error: constexpr variable 'x' must be initialized by a constant expression
constexpr int x = 1/ 0 ;
             ^   ~~~~
note: division by zero
constexpr int x = 1/ 0 ;
                  ^
warning: division by zero is undefined [-Wdivision-by-zero]
constexpr int x = 1/ 0 ;
                  ^ ~

Helpfully noting that division by zero is undefined.

The draft C++ standard section 5.19 Constant expressions [expr.const] says:

A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions

and includes the following bullet:

an operation that would have undefined behavior [Note: including, for example, signed integer overflow (Clause 5), certain pointer arithmetic (5.7), division by zero (5.6), or certain shift operations (5.8) —end note ];

Is 1 / 0 a constant expression in C11?

1 / 0 is not a constant expression in C11; we can see this from section 6.6 Constant expressions, which says:

Each constant expression shall evaluate to a constant that is in the range of representable values for its type.

although, it does allow:

An implementation may accept other forms of constant expressions.

So 1 / 0 is not a constant expression in either C or C++, but that does not change the answer, since it is not being used in a context that requires a constant expression. I suspect the OP meant that 1 / 0 is available for constant folding since both operands are literals; this would also explain the crash.
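
If the crash does come from the constant folder performing the division on the host, one way to avoid it is to check the operands before folding. The following is only a minimal sketch in C under that assumption; `fold_div`, its parameters, and the wording of the messages are hypothetical and not taken from any real compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper for a toy compiler: try to fold `lhs / rhs` at
   compile time. Returns true and stores the quotient in *out when the
   division is safe to perform on the host. Returns false, after printing
   a warning that names the source line, when performing the division
   would itself be undefined. The caller can then leave the expression
   unfolded and emit code that divides at run time. */
bool fold_div(int lhs, int rhs, int line, int *out)
{
    assert(out != NULL);
    if (rhs == 0) {
        fprintf(stderr, "line %d: warning: division by zero is undefined\n", line);
        return false;
    }
    if (lhs == INT_MIN && rhs == -1) {
        /* The quotient is not representable in int on two's complement. */
        fprintf(stderr, "line %d: warning: overflow in constant division\n", line);
        return false;
    }
    *out = lhs / rhs;
    return true;
}
```

With this shape the user gets a line number instead of an unexplained failure, and the program is still translated, which matters for cases such as a function that is never called.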

Shafik Yaghmour
  • 1
    So while technically "crashing is allowed", it's way more user friendly to show a proper error message instead. (But I guess that's only an option for errors recognizable while compiling.) – Jongware Nov 25 '15 at 13:03
  • @Jongware I was literally writing a QOI paragraph while you made that comment. – Shafik Yaghmour Nov 25 '15 at 13:05
  • I can see that the end of your answer probably says it for C++, but what do you think of a program containing `if (0) { int x = 1 / 0; }` in C? – Pascal Cuoq Nov 25 '15 at 13:38
  • 2
    @Jongware Yes, but any error that would, without any code specifically to handle it, crash the compiler, by definition is recognizable while compiling. – Random832 Nov 25 '15 at 14:35
  • @Random832: I don't think so :) This particular construction can be recognized by a parser and so one could issue a proper error message. Anything else would amount to a `try .. catch` around the entire compiler and only tell you "something probably went wrong somewhere". – Jongware Nov 25 '15 at 15:02
  • @Jongware You could do a try..catch around the whole constant folding operation and note the line number. – Random832 Nov 25 '15 at 15:51
  • I would say there is a difference between the C language and a compiler program. A program produced by a C compiler will exhibit UB when division by zero occurs. A compiler program that crashes on division by zero is NOT UB in the language sense; it is just a bug (the compiler might even be written in Fortran). – Paul Ogilvie Nov 25 '15 at 15:55
  • @PaulOgilvie the notes to undefined behavior in both C and C++ say *to terminating a translation or execution* now I agree that an ICE is always considered a bug but I believe it is still conforming. – Shafik Yaghmour Nov 25 '15 at 15:56
  • @ShafikYaghmour, and what if the compiler is written in Fortran? UB as quoted here only pertains to the C language and hence to programs written in that language. If the compiler is written in C, then the quotes are correct but do not pertain to the OP's program but to the compiler. "Terminating a translation" is a graceful termination, not a crash. – Paul Ogilvie Nov 25 '15 at 16:00
  • @PaulOgilvie the problem is that the standard says *imposes no requirements* and although one could argue your way the standard does not define *terminating* and so perhaps these are possible defects. I can see that line of reasoning. The standard does not really confine UB to run-time and I don't think it can since a lot of optimization decisions and constexpr behavior depends on compiler choices wrt to UB. I agree some clarification is needed though. – Shafik Yaghmour Nov 25 '15 at 16:34
  • 1
    Once I asked a question about the [behaviour of the compiler when UB is present](http://stackoverflow.com/questions/32154832/is-the-behaviour-of-the-compiler-undefined-with-undefined-behaviour), new answers would be very welcome... – alain Nov 25 '15 at 16:38
  • Doesn't this only apply to the *evaluation* of the `/` operator? It's not UB if the expression is never evaluated. For example, the standard [explicitly notes](http://port70.net/~nsz/c/c11/n1570.html#note118) that `2 || 1 / 0` is defined behavior, even though it contains as a subexpression the expression `1 / 0`. – user2357112 Nov 25 '15 at 17:15
  • Also, as part of the constraints of "constant expression", it says "Each constant expression shall evaluate to a constant that is in the range of representable values for its type." Since this is a constraint, that means if an expression doesn't evaluate to a representable value, it's not a constant expression. **1 / 0 is thus not a constant expression.** – user2357112 Nov 25 '15 at 17:24
  • Intel also requests that ICE's are reported as bugs (to add to your list of those who consider it bugs). – tpg2114 Nov 25 '15 at 18:09
  • @PascalCuoq I think the standard does not specify this, we can make some guesses, Keith's and your answer to the linked question I quote and mentioned in other comments gives a good stab but clearly it is not specified. I think it is reasonable to say if the compiler can prove it won't be executed then it could act *as-if* there was no UB but we can't prove that is required. – Shafik Yaghmour Nov 28 '15 at 19:08
  • 1
    @ShafikYaghmour I would approach the question from the other direction. In the presence of `int main(int c, char*v[]) { if (!exp1) { int y = 1 / exp2; } }` the compiler has to behave and to generate code that behaves as long as the compiler cannot prove that `exp1` and `exp2` are never 0 simultaneously, because the program can be invoked with arguments that make only one or the other 0 and the programmer has the right to expect the program to have been generated and to work then. I do not see why there should be a discontinuity when exp1 and exp2 are compile-time constants. – Pascal Cuoq Nov 28 '15 at 19:29
  • 1
    @PascalCuoq hmmm, this is an interesting perspective and [defect report 109](http://www.open-std.org/jtc1/sc22/wg14/docs/rr/dr_109.html) agrees with you on this. – Shafik Yaghmour Dec 06 '15 at 03:10
  • @PascalCuoq interesting this ub mailing list [response](http://www.open-std.org/pipermail/ub/2014-September/000515.html) sure makes it seems like in C++ the approach is different. – Shafik Yaghmour Dec 06 '15 at 03:28
  • The published Rationale for the C Standard goes into further detail about UB, and recognizes a compiler's choice of how to behave when the Standard imposes no requirements as a "quality of implementation" issue, noting elsewhere that an implementation could be conforming and yet be of sufficiently poor quality as to be useless. – supercat Aug 28 '18 at 19:11
24

The mere presence of 1 / 0 does not permit the compiler to crash. At most, it is permitted to assume that the expression will never be evaluated, and thus, that execution will never reach the given line.

If the expression is guaranteed to be evaluated, the standard imposes no requirements on the program or compiler. Then the compiler can crash.

1 / 0 is only UB if evaluated.

The C11 standard gives an explicit example of 1 / 0 being defined behavior when unevaluated:

Thus, in the following initialization,

        static int i = 2 || 1 / 0;

the expression is a valid integer constant expression with value one.

Section 6.6, footnote 118.

1 / 0 is not a constant expression.

Section 6.6 of the C11 standard, under Constraints, says

  1. Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.
  2. Each constant expression shall evaluate to a constant that is in the range of representable values for its type.

Since 1/0 does not evaluate to a constant in the range of values representable by an int, 1/0 is not a constant expression. This is a rule about what counts as a constant expression, like the rule about not having assignments in it. You can see that at least for C++, Clang doesn't consider 1/0 a constant expression:

prog.cc:3:18: error: constexpr variable 'x' must be initialized by a constant expression
   constexpr int x = 1/ 0 ;
                 ^   ~~~~

It wouldn't make much sense for an unevaluated 1 / 0 to be UB.

(x == 0) ? x : 1 / x is perfectly well-defined, even if x is 0 and evaluating 1/x is UB. If it were the case that (0 == 0) ? 0 : 1 / 0 were UB, that would be nonsense.
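
As a small illustration of the point above, here is the conditional expression wrapped in a function. This is only a sketch; the names `guarded_reciprocal` and `guarded_reciprocal_demo` are made up for the example.

```c
#include <assert.h>

/* The division is only evaluated when x is nonzero, so this function is
   well-defined for every int argument, including 0. */
int guarded_reciprocal(int x)
{
    return (x == 0) ? x : 1 / x;
}

/* Usage: every call below is well-defined. */
void guarded_reciprocal_demo(void)
{
    assert(guarded_reciprocal(0) == 0);   /* 1 / x is never evaluated */
    assert(guarded_reciprocal(1) == 1);
    assert(guarded_reciprocal(-1) == -1);
    assert(guarded_reciprocal(4) == 0);   /* integer division truncates */
}
```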

user2357112
  • 3
    Another example that the OP should consider is `void f() { int i = 1 / 0; }`, which may occur in a strictly conforming program that never calls `f`. Just in case the OP was considering trying to fix the bug by adding simple dead code analysis. –  Nov 25 '15 at 19:45
  • If it is guaranteed that program flow will cause UB, then the behaviour of the entire program is undefined. (But the compiler still should not crash of course). – M.M Nov 25 '15 at 23:36
  • 3
    I think this answer is great and all 3 points are correct, but I don't agree with the conclusion in the context of the question. If we assume that `1 / 0` *is* evaluated and there is UB, then standard does not give any requirements on behaviour of program *or compiler*. While it's not good to crash compiler, it's still valid thing to do. Expression not being a constant expression is irrelevant in this case. – user694733 Nov 26 '15 at 09:19
  • @user694733: The way I read the question, there's no guarantee the expression will be evaluated. I suppose it's worth adding a note about what happens if the expression is guaranteed to be evaluated. – user2357112 Nov 26 '15 at 18:49
  • @user2357112: Although it is common for implementations to convert source text to an executable which may be executed at leisure, the Standard regards an "implementation" as being something which accepts the source code, converts it into some other form if convenient, and then executes it. I don't think the Standard guarantees a "sequence point" between the end of translation and the start of execution. – supercat Aug 28 '18 at 19:07
15

From C standard draft (N1570):

6.5.5 Multiplicative operators

...

  1. The result of the / operator is the quotient from the division of the first operand by the second; the result of the % operator is the remainder. In both operations, if the value of the second operand is zero, the behavior is undefined.

And about undefined behaviour in chapter 3. Terms, definitions, and symbols:

3.4.3

  1. undefined behavior
    behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
  2. NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

So crashing the compiler is allowed.

user694733
  • It might be worth also noting about just the presence of undefined constructs causes UB, or it has to be able to be invoked to cause it ;) – dhein Nov 25 '15 at 14:30
  • Well, it is not directly stated, but I would say the opposite, regarding "behavior, **upon use of** a nonportable or erroneous program construct". And in Section 3.4 behavior is defined as "external appearance or action", so for something to be called undefined behavior it has to fulfill the requirements of behavior. So the compiler would have to prove beforehand that it at least COULD, if not CAN, appear, so that it fulfills the state of appearing or acting. But that is just an interpretation that sounds logical from my point of view. – dhein Nov 25 '15 at 15:19
  • 3
    "From the wording on standard it seems to me that invoking is not necessary. Either there is UB or not." -- There is UB if the abstract machine would divide by zero, if the `/` operator with a RHS of `0` would ever be evaluated. Not if the code never even attempts to divide by zero, not even if there is a division by zero in dead code. UB is unconditional in one sense, if there is UB anywhere then the whole program has UB and the compiler may crash or otherwise reject the program, but depending on the type of UB, whether there is UB anywhere may depend on the would-be-UB code being reached. –  Nov 25 '15 at 19:51
  • @hvd It seems you are right, unevaluated code is not necessarily UB as highlighted by user2357112's answer. – user694733 Nov 26 '15 at 08:43
2

Others have already mentioned the relevant text from the standards, so, I'm not going to repeat that.

My C compiler's expression evaluating function takes an expression in Reverse Polish Notation (array of values (numbers and identifiers) and operators) and returns two things: a flag for whether or not the expression evaluates to a constant and the value if it's a constant (0 otherwise). If the result is a constant, the whole RPN reduces to just that constant. 1/0 is not a constant expression since it doesn't evaluate to a constant integer value. The RPN is not reduced for 1/0 and stays intact.
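
To make the "flag plus value" interface concrete, here is a minimal sketch in C of such an evaluator. It is not the code of the compiler described above: the `Token` layout, the `TokKind` names, the fixed stack depth of 64, and the function name `rpn_eval_const` are all assumptions made for illustration, and only four binary operators are handled.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy RPN expression. */
typedef enum { TOK_NUM, TOK_IDENT, TOK_ADD, TOK_SUB, TOK_MUL, TOK_DIV } TokKind;

typedef struct {
    TokKind kind;
    int value;      /* used only when kind == TOK_NUM */
} Token;

/* Evaluate an RPN token array. Returns true and stores the result in
   *out when the whole expression is a constant. Returns false (and
   stores 0) when it is malformed, contains an identifier, or contains
   an operation whose result is not a representable int, such as 1/0.
   Assumes long long is wider than int, as on mainstream platforms. */
bool rpn_eval_const(const Token *toks, size_t n, int *out)
{
    int stack[64];
    size_t sp = 0;

    assert(out != NULL);
    *out = 0;
    for (size_t i = 0; i < n; i++) {
        if (toks[i].kind == TOK_NUM) {
            if (sp == 64) return false;            /* expression too deep */
            stack[sp++] = toks[i].value;
            continue;
        }
        if (toks[i].kind == TOK_IDENT) return false;   /* not a constant */
        if (sp < 2) return false;                  /* malformed RPN */
        int rhs = stack[--sp];
        int lhs = stack[--sp];
        long long r;                               /* wide intermediate */
        switch (toks[i].kind) {
        case TOK_ADD: r = (long long)lhs + rhs; break;
        case TOK_SUB: r = (long long)lhs - rhs; break;
        case TOK_MUL: r = (long long)lhs * rhs; break;
        case TOK_DIV:
            if (rhs == 0) return false;            /* 1/0: do not fold */
            r = (long long)lhs / rhs;
            break;
        default: return false;
        }
        if (r < INT_MIN || r > INT_MAX) return false;  /* not representable */
        stack[sp++] = (int)r;
    }
    if (sp != 1) return false;                     /* leftover operands */
    *out = stack[0];
    return true;
}
```

When the function returns false the caller keeps the original token array, which matches the idea that the RPN for 1/0 stays intact and code is generated for it.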

In C, static variables can be initialized with constant values only. So, the compiler errors out when it sees that an initializer for a static variable is not a constant. Variables of automatic storage can be initialized with non-constant expressions. In this case my compiler generates code to evaluate 1/0 (it still has the RPN for this expression!). If this code is reached at runtime, UB occurs as prescribed by the language standards. [On x86 this UB takes on the form of the division by zero CPU exception, while on MIPS this UB yields an incorrect quotient value (the CPU does not have a division by zero exception).]

My compiler properly supports short-circuiting in ||-expressions and &&-expressions. So, it evaluates 1 || 1/0 as 1 and 0 && 1/0 as 0, regardless of whether or not the right-hand operand of the logical operator is a constant. The expression evaluating function removes the right-hand operands of these operators (along with the operators) when they must not be evaluated. So 1 || 1/0 transforms into 1 != 0 (recall that the operands of && and || undergo comparison with 0), which yields 1, and 0 && 1/0 transforms into 0 != 0, which yields 0.

Another case to take care of is INT_MIN / -1 and INT_MIN % -1 (ditto for larger integer types). The quotient is not representable as a signed int (in the case of 2's complement signed integers, which is what we have in all modern CPUs) and so this is UB as well (you get the same division by zero exception on x86 at runtime). I handle this case similarly. This expression can't initialize a variable of static storage and it's thrown away if it's not evaluated in the logical &&/|| operator. It can initialize an automatic variable, possibly leading to UB at runtime.
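
Both problem cases can be checked with a single predicate before folding or before warning. This is a sketch only; the name `int_div_is_defined` is hypothetical, and the code assumes a two's complement `int`, as the paragraph above does.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Document the representation assumption at compile time (C11). */
static_assert(INT_MIN < -INT_MAX, "this sketch assumes two's complement int");

/* True when both a / b and a % b are defined for int operands. */
bool int_div_is_defined(int a, int b)
{
    if (b == 0) return false;                    /* division by zero */
    if (a == INT_MIN && b == -1) return false;   /* quotient overflows */
    return true;
}
```

The same test covers `%`, because in C `a % b` is undefined whenever the quotient `a / b` is not representable.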

I also issue a warning when such division is encountered.

Alexey Frunze
  • I suppose RPN for expressions is simpler and more efficient than an AST? – fredoverflow Dec 02 '15 at 10:02
  • @fredoverflow Well, it has its upsides and downsides. It needs little memory, it's easy to slap a unary operator on top of an expression or a binary operator with a right operand and it's easy to generate code (poor but functional) from. However, traversing it as a tree is more difficult and so is transforming it (especially inserting, deleting or rotating nodes; all the usual problems with arrays). One unexpected benefit from RPNs in arrays may be that your compiler doesn't need to support structures (to implement trees with interlinked nodes) to be able to compile itself. – Alexey Frunze Dec 02 '15 at 10:42
-1

How the compiler should behave is unrelated to the value of the expression. The compiler should not crash. Period.

I imagine that a pedantic implementation, given an expression like this, would compile to code that will execute 1/0 at run time, but I don't think that would be seen as a good feature.

So the remaining space is that the compiler should decline to compile it, and treat it as some class of source code error.

ddyer