
As user Tony points out, there's a [Note] in paragraph 1.3.12 of the C++ standard saying:

permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment...

Doesn't this contradict the definition of UB, which says "...this International Standard imposes no requirements"? I mean, they say "no requirements" and then say "permissible UB" - right in the same paragraph.

How should this note be interpreted? Does it indeed limit UB in any way?

– sharptooth

3 Answers


From §6.5.1 of Part 3 of the ISO/IEC Directives:

Notes and examples integrated in the text of a standard shall only be used for giving additional information intended to assist the understanding or use of the standard and shall not contain provisions to which it is necessary to conform in order to be able to claim compliance with the standard.

So it's entirely non-normative (non-binding) and meant only for possible clarification.

– Matthew Flaschen

As notes are not normative, this note doesn't limit UB in any way. It's just a clarification that an implementation could define some constructs that formally cause UB as a documented extension, although any program that relies on such a detail is, of course, inherently not safely portable to other environments.
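
For example, signed integer overflow is formally UB, but GCC documents a `-fwrapv` switch that makes signed arithmetic wrap in two's complement, which is exactly such a documented extension (a minimal sketch; a program relying on it is tied to that environment):

#include <climits>
#include <iostream>

int main() {
    int x = INT_MAX;
    ++x; // UB per the standard; wraps to INT_MIN under GCC's documented -fwrapv
    std::cout << x << '\n';
}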

– CB Bailey

This note explains what an implementation might do if it encounters code for which there is no defined behaviour. The word "permissible" is not intended as a restriction; rather, the note gives some examples of common behaviours.

It is interesting to note that a compiler almost always HAS to compile something! Consider this fragment of code:

void f() { 1 / 0; }  // undefined behaviour, but only if f() is actually called

the behaviour of the translator on encountering this is not well defined, but it can't just do anything it likes! If it is a compiler, it is still required to compile this translation unit, because the behaviour of a program containing this function could still be well defined: the compiler cannot know whether the function is ever called. In fact this question arose in a case where the function was "main()" and control was certain to flow through the zero division, and the upshot is that the compiler is not allowed to reject even that program. The reason is that the program is still well formed, and a conforming compiler is required to accept all well-formed programs (and to issue a diagnostic for ill-formed ones, unless otherwise specified).
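
To make the separate-compilation point concrete, here is a sketch (the file names are illustrative):

// a.cpp - must be accepted on its own: the compiler
// cannot know whether f() is ever called
void f() { 1 / 0; }

// b.cpp - a second translation unit
void f();

int main() {
    // f() is never called, so the complete program's
    // behaviour is perfectly well defined
    return 0;
}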

This can't easily be made ill-formed, because it is hard to specify exactly how smart a compiler would be required to be in detecting when a division by zero must take place.

So, interestingly, the Standard's claim that it "imposes no requirements" is in fact very close to being wrong. It is a characteristic of a compilation system supporting separate compilation that it cannot detect whether a piece of code with no well-defined behaviour is ever executed, and so the compiler is still required to compile something, because it cannot deduce whether the program has undefined behaviour.

– Yttrill
  • Unfortunately, if a compiler can determine that the only way for a function to avoid Undefined Behavior would be to call another function that might possibly exit the program before the point of Undefined Behavior is reached, the compiler would be allowed to make the outer function call the inner function unconditionally. – supercat Apr 23 '15 at 16:48
  • Why is that unfortunate? – Yttrill Apr 24 '15 at 17:17
  • It is unfortunate because it means that in many scenarios where a programmer would be happy with *almost* anything that a compiler might do in case of e.g. an arithmetic overflow, the programmer must add code to deterministically prevent the overflow from occurring. Such code will add extra complexity to the program, making it harder to write and to read, and will also in many cases impair optimizations which the programmer would have been perfectly happy with. If a program could specify "I need the compiler to limit the consequences of overflow to... – supercat Apr 24 '15 at 17:32
  • ...any arbitrarily chosen one of the following", and then know that `x=y+z` wouldn't revoke the laws of causality, not only would such an expression be easier to write and read than `x = (int)((unsigned)y+z);` (see the sketch after this thread), but depending upon what options the programmer had decided to accept it could allow many more optimization opportunities than the latter form. For example, if a programmer said it was acceptable for overflow to yield a Partially-Indeterminate Value (which may behave as an arbitrary-precision integer whose lower 32 bits are correct and upper bits could be anything), then... – supercat Apr 24 '15 at 17:35
  • ...a compiler could safely assume that if `y` was greater than zero, `y+z` would be greater than `z` (an assumption that would be expressly forbidden if code used the `unsigned` cast). I suppose a compiler that was given `s < 32 ? v>>s : __INDETERMINATE ? 0 : v>>(s & 31)` might figure out that it could implement that behavior using a single shift instruction, but I don't see how that's really an improvement over `v >> s`. – supercat Apr 24 '15 at 17:42
  • Well, what you want is available via QoI, Quality of Implementation. Whilst the Standard doesn't require anything, individual compiler vendors are free to specify something, indeed to provide switches that select a behaviour. In fact this freedom can only come precisely because the standard makes no requirement. – Yttrill Apr 25 '15 at 23:20
  • The problem is, many circumstances where behaviour is not defined are hard to detect statically, especially things like overflow, array bounds violations, etc. Indeed, allow me to refer you to the language ATS (Applied Type System) by Hongwei Xi, which provides facilities for the programmer to use dependent typing to ensure at compile time that there are no array bounds breaches. The programmer must provide the proofs using the type system. – Yttrill Apr 25 '15 at 23:24
  • Historically, it used to be that a quality C compiler would endeavor to provide stronger guarantees than the language standard itself required; implementations weren't required to do anything useful for any forms of Undefined Behavior, but imparting useful behavior to some, useful trapping for others, or for some perhaps a selectable choice between them, was considered better than random behavior. From a purely-requirements-based perspective, saying "Implementations must specify anything left-shift of a negative number may do, but such specification could include Undefined Behavior"... – supercat Apr 26 '15 at 04:17
  • ...would be really no different from saying "left-shifts of negative numbers invoke Undefined Behavior", since the most the Standard would require of any implementation would be a piece of documentation, and not any sort of predictability. On the other hand, from a normative perspective, I don't think the standard writers ever thought Undefined Behavior was a good thing; some systems might have poorly-designed trap mechanisms that would prevent compilers from offering anything better. The set of actions that cause UB are a lousy basis for inference-based optimization, since... – supercat Apr 26 '15 at 04:23
  • ...there are many cases where programmers would be e.g. happy with having overflow yield a partially-indeterminate value or a recognizable trap, but cannot accept Undefined Behavior. Forcing programmers to write code to prevent overflow will make programs run more slowly than they could have run if overflows had been allowed to yield a partially-indeterminate value. Likewise there are many systems where a slightly relaxed version of the aliasing rule could have been adopted, which would say that aliasing effectively creates additional variables, and any particular write or read may... – supercat Apr 26 '15 at 04:32
  • ...arbitrarily access any of them. This could have been especially useful if there were a construct one could use which would say that certain variables which are read in a loop must be reloaded from memory at least once every N times through the loop. If e.g. N is 5 and a loop gets unrolled 10 times, a compiler would have to refresh the items twice in the resulting loop, rather than 10 times. Major win, but only meaningful if an implementation defines the effect of aliasing. – supercat Apr 26 '15 at 04:35
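
A sketch of the trade-off discussed in this thread (function names are illustrative): the cast through unsigned buys well-defined wrapping, at the price of the ordering inference the compiler could otherwise make.

int add_wrapping(int y, int z) {
    // Unsigned addition wraps modulo 2^N by definition, so it is never
    // undefined; converting the out-of-range result back to int is
    // implementation-defined before C++20 and two's complement from C++20 on.
    return (int)((unsigned)y + z);
}

int add_plain(int y, int z) {
    // Signed overflow here is undefined behaviour, which is exactly what
    // permits a compiler to assume that y > 0 implies y + z > z.
    return y + z;
}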