
I was reading a post on some nullptr peculiarities in C++, and one particular example confused me.

Consider (simplified example from the aforementioned post):

struct A {   
    void non_static_mem_fn() {}  
    static void static_mem_fn() {}  
};


A* p{nullptr};

/*1*/ *p;
/*6*/ p->non_static_mem_fn();
/*7*/ p->static_mem_fn();

According to the authors, expression /*1*/, which dereferences the null pointer, does not cause undefined behaviour by itself. The same goes for expression /*7*/, which calls a static member function through the null pointer.

The justification is based on issue 315 in C++ Standard Core Language Closed Issues, Revision 100, which states:

...*p is not an error when p is null unless the lvalue is converted to an rvalue (7.1 [conv.lval]), which it isn't here.

thus making a distinction between /*6*/ and /*7*/.

So, the actual dereferencing of the nullptr is not undefined behaviour (answer on SO, discussion under issue 232 of C++ Standard, ...). Thus, the validity of /*1*/ is understandable under this assumption.

However, how is /*7*/ guaranteed to not cause UB? As per the cited quote, there is no lvalue-to-rvalue conversion in p->static_mem_fn();. But the same is true for /*6*/ p->non_static_mem_fn();, and I think my reading is confirmed by this quote from the same issue 315:

/*6*/ is explicitly noted as undefined in 12.2.2 [class.mfct.non-static], even though one could argue that since non_static_mem_fn(); is empty, there is no lvalue->rvalue conversion.

(in the quote, I changed "which" and f() to get the connection to the notation used in this question).


So, why is a distinction made between p->static_mem_fn(); and p->non_static_mem_fn(); with respect to causing UB? Is there an intended use for calling static functions through pointers that could potentially be null?


  • Does this answer your question? [c++ access static members using null pointer](https://stackoverflow.com/questions/28482809/c-access-static-members-using-null-pointer) – Passer By Aug 17 '20 at 09:58

1 Answer

Standard citations in this answer are from working draft N4713, which was published shortly after C++17.

One of the sections cited in your question answers the question for non-static member functions. [class.mfct.non-static]/2:

If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.

This applies to, for example, accessing an object through a different pointer type:

std::string foo;

A *ptr = reinterpret_cast<A *>(&foo); // not UB by itself
ptr->non_static_mem_fn();             // UB by [class.mfct.non-static]/2

A null pointer doesn't point at any valid object, so it certainly doesn't point to an object of type A either. Using your own example:

p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2

With that out of the way, why does this work in the static case? Let's pull together two parts of the standard:

[expr.ref]/2:

... The expression E1->E2 is converted to the equivalent form (*(E1)).E2 ...

[class.static]/1 (emphasis mine):

... A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.

The second quote, in particular, says that the object expression is evaluated even for static member access. This matters if, for example, the object expression is a function call with side effects.
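To make the side-effect point concrete, here is a minimal sketch. The counter `evaluations` and the helpers `get_a()` and `check_object_expression_is_evaluated()` are hypothetical names introduced for illustration; only `A` and `static_mem_fn` come from the question. The object expression refers to a real object, so no null pointer is involved.

```cpp
#include <cassert>

struct A {
    static void static_mem_fn() {}
};

// Hypothetical counter: how often the object expression was evaluated.
int evaluations = 0;

// Hypothetical helper with a visible side effect; returns a real object.
A& get_a() {
    static A instance;
    ++evaluations;
    return instance;
}

void check_object_expression_is_evaluated() {
    evaluations = 0;

    // The static function needs no object, yet get_a() still runs.
    get_a().static_mem_fn();
    assert(evaluations == 1);

    // The qualified form has no object expression, so nothing else runs.
    A::static_mem_fn();
    assert(evaluations == 1);
}
```

Calling `check_object_expression_is_evaluated()` should pass both assertions: the member-access form evaluates `get_a()` exactly once, and the `A::` form does not touch it.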

Put together, this implies that these two blocks are equivalent:

// 1
p->static_mem_fn();

// 2
*p;
A::static_mem_fn();

So the final question to answer is whether *p alone is undefined behavior when p is a null pointer value.

Conventional wisdom would say "yes" but this is not actually true. There is nothing in the standard that states dereferencing a null pointer alone is UB and there are several discussions that directly support this:

  • Issue 315, as you have mentioned in your question, explicitly states that *p is not UB when the result is unused.
  • DR 1102 removes "dereferencing the null pointer" as an example of UB. The given rationale is:

    There are core issues surrounding the undefined behavior of dereferencing a null pointer. It appears the intent is that dereferencing is well defined, but using the result of the dereference will yield undefined behavior. This topic is too confused to be the reference example of undefined behavior, or should be stated more precisely if it is to be retained.

  • This DR links to issue 232, which discusses adding wording that would explicitly make *p defined behavior when p is a null pointer, as long as the result is not used.
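One case where the standard text already assigns a specific outcome to `*p` with a null `p` is `typeid`: per [expr.typeid], applying `typeid` to a dereferenced null pointer of polymorphic class type throws `std::bad_typeid`. The sketch below only illustrates that special case and does not settle the general question. `Base`, `typeid_throws`, and `check_typeid_on_null` are hypothetical names, not taken from the question.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic type: typeid evaluates its operand only
// when the operand is a glvalue of polymorphic class type.
struct Base {
    virtual ~Base() = default;
};

// Returns true if typeid(*p) threw std::bad_typeid.
bool typeid_throws(Base* p) {
    try {
        const std::type_info& info = typeid(*p);
        (void)info;  // the result is not needed, only the evaluation
    } catch (const std::bad_typeid&) {
        return true;
    }
    return false;
}

void check_typeid_on_null() {
    Base b;
    assert(!typeid_throws(&b));      // real object: no exception
    assert(typeid_throws(nullptr));  // null pointer: std::bad_typeid
}
```

Note that this relies on `Base` being polymorphic. For a non-polymorphic type the operand of `typeid` is unevaluated, so no exception would be thrown.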

In conclusion:

p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2
p->static_mem_fn();     // Defined behavior per issues 232 and 315.
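For code that would rather not depend on how issues 232 and 315 are read, one option is to avoid the dereference altogether. The following is a sketch under that assumption, not something the standard or the cited issues prescribe: it names the class through `decltype`, whose operand is unevaluated, and guards the non-static call with a null check. The counters and the helpers `call_both` and `check_call_both` are hypothetical and exist only to make the behavior observable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical counters, for illustration only.
int static_calls = 0;
int non_static_calls = 0;

struct A {
    void non_static_mem_fn() { ++non_static_calls; }
    static void static_mem_fn() { ++static_calls; }
};

// Hypothetical helper: returns true if the non-static call was made.
bool call_both(A* p) {
    // decltype(p) is an unevaluated operand, so p is never dereferenced.
    std::remove_pointer_t<decltype(p)>::static_mem_fn();

    if (p == nullptr) {
        return false;  // skip the call that would be UB
    }
    p->non_static_mem_fn();
    return true;
}

void check_call_both() {
    static_calls = 0;
    non_static_calls = 0;

    assert(!call_both(nullptr));  // static call only
    A a;
    assert(call_both(&a));        // both calls
    assert(static_calls == 2);
    assert(non_static_calls == 1);
}
```

When the class name is known at the call site, writing `A::static_mem_fn()` directly is simpler. The `decltype` form is mainly useful in generic code where only the pointer is available.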
  • Great explanation. For me, the key was the equivalence of the two code blocks. That was exactly the missing piece to connect all the dots. – Anton Menshov Aug 10 '20 at 01:25
  • @AntonMenshov Glad you found it useful! As usual when answering obscure C++ questions, I also learned several things while authoring the answer. – cdhowie Aug 10 '20 at 02:06
  • _There is nothing in the standard that states dereferencing a null pointer alone is UB_ Have you seen [expr.unary.op]/1? – Language Lawyer Aug 10 '20 at 10:23
  • @LanguageLawyer I have. I read it multiple times. There is nothing in that paragraph that mentions that the pointer must not be null, nor is there even anything saying that _the pointer must point at an object of the same type as the pointer is declared to point to._ – cdhowie Aug 10 '20 at 14:23
  • Ok. Ok. [expr.unary.op]/1: «the result is an lvalue referring to the object or function to which the expression points». As you wrote _«A null pointer doesn't point at any valid object»_ (idk what should "valid" mean, though, it just doesn't point to any object). What is the result of `*p` if `p` has a null pointer value? – Language Lawyer Aug 10 '20 at 14:32
  • Does it matter what the result of `*p` is if it's unused? That's the entire point. The section quoted does not indicate this is UB. The other citations in my answer directly support this interpretation. – cdhowie Aug 10 '20 at 14:52
  • _Does it matter what the result of *p is if it's unused? That's the entire point_ It is the point of CWG issues which have not been resolved yet. _The section quoted does not indicate this is UB._ It is explicit enough that the value of a pointer should be "pointer to some object/function". And for a null pointer value, there is UB by omission. – Language Lawyer Aug 10 '20 at 14:55
  • There are multiple interpretations possible, and that's why there is discussion around making the intent more explicit. DR 1102 seems to support the interpretation that null pointer deref is _not always UB_; otherwise it would not have been removed as _the canonical example_ of UB. Note that [expr.unary.op]/1 does not indicate in any way that the pointed-to object must match the type of the pointer, either. This paragraph seems underspecified in multiple ways, and in some of those cases the behavior is accepted as defined. – cdhowie Aug 11 '20 at 08:19
  • In the case of a mismatched pointer type, this is covered by [basic.lval]/11. This section talks specifically about "accessing the stored value of an object." This makes me wonder if this code is UB: `std::string foo; double *p = reinterpret_cast<double *>(&foo); *p;`. Is `*p` considered to be "accessing the stored value" if the result of the dereference is unused? If a deref alone is not enough to be considered accessing the stored value at the pointer target, this casts doubt on the claim that a null pointer deref alone is UB as well. – cdhowie Aug 11 '20 at 08:41
  • _Note that [expr.unary.op]/1 does not indicate in any way that the pointed-to object must match the type of the pointer, either_ So? It is fine if it doesn't match. I don't understand why you're mentioning this for the second time. What is the connection with null pointer values? – Language Lawyer Aug 11 '20 at 14:16
  • _In the case of a mismatched pointer type, this is covered by [basic.lval]/11_ [basic.lval] is about accesses, not application of the indirection operator. _This makes me wonder if this code is UB_ No, it is not UB. _Is *p considered to be "accessing the stored value"_ See [the definition of access](https://timsong-cpp.github.io/cppwp/n4861/defns.access). – Language Lawyer Aug 11 '20 at 14:19
  • _this casts doubt on the claim that a null pointer deref alone is UB as well_ Again. [expr.unary.op]/1: «the result is an lvalue referring to the object or function to which the expression points». Which object does the `*p` lvalue refer to if `p` has a null pointer value? – Language Lawyer Aug 11 '20 at 14:20
  • And if you want me to be noticed about your comments, you need to cast me. – Language Lawyer Aug 11 '20 at 14:22
  • @LanguageLawyer _"the result is an lvalue referring to the object or function to which the expression points"_ is... a defect. The committee is committed (heh) to allow null pointer dereferencing for the sake of `typeid(*p)` and `&*p` in CWG 232. It never was resolved but the intent is obvious. – Passer By Aug 17 '20 at 10:09