According to the C++ Standard, it is mandatory for every implementation to document "implementation-defined behavior":

1.3.11 [defns.impl.defined] implementation-defined behavior

behavior, for a well-formed program construct and correct data, that depends on the implementation and that each implementation documents

And reading an invalid pointer value has implementation-defined behavior (see 4.1 Lvalue-to-rvalue conversion [conv.lval]):

if the object to which the glvalue refers contains an invalid pointer value (3.7.4.2, 3.7.4.3), the behavior is implementation-defined.

(Quoted from draft n4527. The wording "Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior." has been in 3.7.4.2 Deallocation functions [basic.stc.dynamic.deallocation] since at least draft n3485.)
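
To make the distinction concrete, here is a minimal sketch of which uses fall into which category under the wording quoted from n4527. The helper name `release_and_report` is hypothetical (it is not from the Standard), and the problematic lines are left commented out so that the snippet itself stays well-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only. It frees an int and returns
// the address it had, captured as an integer while the pointer was valid.
std::uintptr_t release_and_report(int*& p) {
    assert(p != nullptr);  // p is still valid here, so reading it is fine
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);

    delete p;  // from here on, p holds an invalid pointer value

    // int* copy = p;         // reads p: implementation-defined per the quoted wording
    // bool same = (p == p);  // also reads p: implementation-defined
    // int v = *p;            // indirection: undefined behavior
    // delete p;              // passed to a deallocation function: undefined behavior

    p = nullptr;  // assignment only writes p, it never reads the invalid value
    return addr;
}
```

The only thing done to `p` after the `delete` is to overwrite it, which avoids the question of what the implementation documents (or fails to document) altogether.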

However, many popular implementations do not define this behavior, and many experts describe this as "undefined behavior" instead.

A likely cause for omission of clear documentation is that, as far as I can determine, evaluation of "invalid pointer values" is missing from the "Index of implementation-defined behavior" which appears in Standard drafts following the appendices.

Is this a defect in the Standard, and are there any open Defect Reports or committee actions taken since C++14 concerning it?

Ben Voigt
  • Seems to me the part that implementations need to define is this - *Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.* (footnote under §3.7.4.3). You should be able to determine from your processor's documentation whether specific bit patterns loaded into some address register will cause a fault. – Praetorian Oct 28 '15 at 16:04
  • So your question is why is this not documented in the *Index of implementation-defined behavior*? – Shafik Yaghmour Oct 28 '15 at 16:07
  • @Praetorian: Behavior of C++ programs is most certainly not identical to behavior documented by the CPU vendor (consider a loop condition that terminates only upon signed-integer wraparound -- the CPU may well define that wraparound occurs, but the C++ optimizer is perfectly entitled to treat it as an infinite loop) – Ben Voigt Oct 28 '15 at 16:08
  • @ShafikYaghmour: Essentially, yes. It could be that it isn't there because it's actually meant to be undefined behavior, and the other sections have a typo. Or maybe there's already a suggested rewording that the committee hasn't yet voted on. Or maybe it IS in the index, using a phrasing I didn't recognize as being related. These are the types of things I'm trying to learn. – Ben Voigt Oct 28 '15 at 16:09
  • I have always found "implementation-defined behavior" a strange concept, as it only constrains compiler documentation authors and not programmers or compiler writers. (Either you are coding portably, in which case "implementation-defined" and "undefined" behaviors must be avoided equally; or you are coding for a specific platform, in which case all you care about is what your compiler guarantees and the spec's distinctions are irrelevant.) – Nemo Oct 28 '15 at 16:51
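
As a rough illustration of the wraparound point raised in the comments: the signed loop below is left commented out because signed overflow is undefined behavior in C++, whatever the CPU does. The unsigned counterpart is well-defined, since unsigned arithmetic wraps modulo 2^N. Both function names are hypothetical, and `std::uint8_t` is assumed to be available (8-bit bytes).

```cpp
#include <cstdint>

// Well-defined: i wraps from 255 back to 0, which ends the loop.
int count_until_unsigned_wrap() {
    int iterations = 0;
    for (std::uint8_t i = 1; i != 0; ++i) {
        ++iterations;
    }
    return iterations;
}

// Undefined behavior, shown for contrast only. The optimizer may assume
// that i never overflows and treat the loop as infinite.
// int count_until_signed_wrap() {
//     int iterations = 0;
//     for (int i = 1; i > 0; ++i) ++iterations;
//     return iterations;
// }
```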

1 Answer

CWG #1438 changed the semantics concerning invalid pointer values. The issue description reads:

The current Standard says that any use of an invalid pointer value produces undefined behavior (3.7.4.2 [basic.stc.dynamic.deallocation] paragraph 4). This includes not only dereferencing the pointer but even just fetching its value. The reason for this draconian restriction is that some architectures in the past used dedicated address registers for pointer loads and stores and they could fault if, for example, a segment number in a pointer was not currently mapped.

It is not clear whether such restrictions are necessary with architectures currently in use or reasonably foreseen. This should be investigated to see if the restriction can be loosened to apply only to dereferencing the pointer.

The change in [conv.lval] is the resolution of CWG #616, which essentially adopted the above.
Lifting this from UB to implementation-defined behavior was intentional, so I presume the absence of this behavior from the Index of implementation-defined behavior is an oversight.

Columbo