
A question came up here on SO asking "Why is this working" when a pointer became dangling. The answers were that it's UB, which means it may work or not.

I learned in a tutorial that:

#include <iostream>

struct Foo
{
    int member;
    void function() { std::cout << "hello";}

};

int main()
{
    Foo* fooObj = nullptr;
    fooObj->member = 5; // This will cause a write access violation but...
    fooObj->function(); // Because this doesn't refer to any memory specific to
                        // the Foo object, and doesn't touch any of its members,
                        // it will work.
}

Would this be the equivalent of:

static void function(Foo* fooObj) // Foo* essentially being the "this" pointer
{
    std::cout << "Hello";
    // Foo pointer, even though dangling or null, isn't touched. And so should 
    // run fine.
}

Am I wrong about this? Is it UB even though, as I explained, I'm just calling a function and not accessing the invalid Foo pointer?

tkausl
Zebrafish
  • This is a topic fraught with debate. Examples of possible duplicates: https://stackoverflow.com/a/28483256/560648 https://stackoverflow.com/q/3498444/560648 https://stackoverflow.com/q/5248877/560648 Those questions largely focus on accessing static members, but accessing _no_ members is ultimately the same question – Lightness Races in Orbit Mar 10 '18 at 20:03
  • @Lightness Races in Orbit Should I assume then that nobody knows the real answer but I shouldn't play with fire? – Zebrafish Mar 10 '18 at 20:08
  • There is no _real_ answer, it's undefined, you can't possibly try to tie a specific behavior to something that's undefined behavior. – Hatted Rooster Mar 10 '18 at 20:09
  • @Zebra: Personally I think you can safely consider this to be UB, but that would be a reasonable fallback position yes – Lightness Races in Orbit Mar 10 '18 at 20:10
  • @SombreroChicken: Whether it has UB or not is (ostensibly) not entirely clear; that's the point – Lightness Races in Orbit Mar 10 '18 at 20:10
  • It is undefined whether it is undefined – Daniel Mar 10 '18 at 20:13
  • @LightnessRacesinOrbit: You agree that trying to make a null reference is UB at the point where the reference is formed, rather than later when it is used, right? – Ben Voigt Mar 10 '18 at 20:14
  • @LightnessRacesinOrbit Let's ask Bjarne? – Hatted Rooster Mar 10 '18 at 20:15
  • @BenVoigt: That's always been my opinion, yes, but I've met a surprising amount of disagreement on this subject – Lightness Races in Orbit Mar 10 '18 at 20:34
  • @LightnessRacesinOrbit: While I tend to agree that none of the Standard rules prohibit forming a bad lvalue reference (UB appears during lvalue to rvalue conversion), it is undeniable that the Standard itself states quite straightforwardly that null references cannot exist and attempting to summon one is dark UB magic. – Ben Voigt Mar 10 '18 at 20:38
  • @LightnessRacesinOrbit Is there really a disagreement ? I can even find a chain of duplicates stating the same thing. Starting from this one: https://stackoverflow.com/questions/11320822/why-does-calling-method-through-null-pointer-work-in-c – llllllllll Mar 10 '18 at 20:39
  • @liliscent: Look at my first comment on this thread – Lightness Races in Orbit Mar 10 '18 at 20:43
  • @BenVoigt: It does not state that normatively anywhere. Even if it did, that would not answer this question. – Lightness Races in Orbit Mar 10 '18 at 20:44
  • Except if the standard says that p->function() is equivalent to (*(p)).function() then that is dereferencing the pointer and that is UB.... I think. – Zebrafish Mar 10 '18 at 20:46

3 Answers


You're reasoning about what happens in practice. Undefined behavior is allowed to do the thing you expect... but it is not guaranteed.

For the non-static case, this is straightforward to prove using the rule found in [class.mfct.non-static]:

If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.

Note that there's no consideration about whether the non-static member function accesses *this. The object is simply required to have the correct dynamic type, and *(Foo*)nullptr certainly does not.
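By contrast, the model from the question is fine when it is written literally as a free function, because then no member access expression is evaluated at all. A minimal sketch, assuming hypothetical names greet and greet_with_null that are not part of the original code:

```cpp
#include <string>

struct Foo
{
    int member;
};

// Hypothetical free function: takes a Foo* but never dereferences it.
// Passing nullptr is well-defined here, because the pointer value is
// only copied and no *ptr or ptr-> is ever evaluated.
std::string greet(Foo* /*unused*/)
{
    return "hello";
}

// Hypothetical helper showing the call with a null pointer.
std::string greet_with_null()
{
    Foo* fooObj = nullptr;
    return greet(fooObj); // fine: nothing is dereferenced
}
```

The difference from fooObj->function() is not in what the generated code is likely to do, but in which expressions the language requires to be evaluated.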


In particular, even on platforms which use the implementation you describe, the call

fooObj->func();

gets converted to

__assume(fooObj); Foo_func(fooObj);

and is optimization-unstable.

Here's an example which will work contrary to your expectations:

int main()
{
    Foo* fooObj = nullptr;
    fooObj->func();
    if (fooObj) {
        fooObj->member = 5; // This will cause a write access violation!
    }
}

On real systems, this is likely to end up with an access violation on the commented line, because the compiler used the fact that fooObj can't be null in fooObj->func() to eliminate the if test following it.

Don't do things that are UB even if you think you know what your platform does. Optimization instability is real.
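If the pointer really may be null, the check has to come before the first -> is evaluated. A sketch of that ordering, assuming a hypothetical helper call_if_valid (not from the original code):

```cpp
#include <iostream>

struct Foo
{
    int member = 0;
    void func() { std::cout << "hello"; }
};

// Hypothetical helper: tests the pointer before any member access.
// Returns true if the member function was actually called.
bool call_if_valid(Foo* fooObj)
{
    if (!fooObj) {
        return false;    // nothing is dereferenced on this path
    }
    fooObj->func();      // only reached with a non-null pointer
    fooObj->member = 5;
    return true;
}
```

With this ordering the compiler has no earlier dereference from which to infer that the pointer is non-null, so the test cannot be removed on those grounds.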


Also, the Standard is even more restrictive than you might think. This will also cause UB:

struct Foo
{
    int member;
    void func() { std::cout << "hello";}
    static void s_func() { std::cout << "greetings";}
};

int main()
{
    Foo* fooObj = nullptr;
    fooObj->s_func(); // well-formed call to static member,
         // but unlike Foo::s_func(), it requires *fooObj to be a valid object of type Foo
}

The relevant portions of the Standard are found in [expr.ref]:

The expression E1->E2 is converted to the equivalent form (*(E1)).E2

and the accompanying footnote

If the class member access expression is evaluated, the subexpression evaluation happens even if the result is unnecessary to determine the value of the entire postfix expression, for example if the id-expression denotes a static member.

This means that the code in question definitely evaluates (*fooObj), attempting to create a reference to a non-existent object. There have been several proposals to allow this and forbid only the lvalue-to-rvalue conversion on such a reference, but those have been rejected thus far; even forming the reference is illegal in all versions of the Standard to date.
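For comparison, here is a sketch of calling the static member without evaluating *fooObj. The helper names are hypothetical, s_func returns a string here only so the result can be checked, and the second variant assumes C++14 for std::remove_pointer_t:

```cpp
#include <string>
#include <type_traits>

struct Foo
{
    int member;
    static std::string s_func() { return "greetings"; }
};

// Hypothetical helper: the qualified name refers to the class, not to an
// object, so no pointer is dereferenced.
std::string call_static_qualified()
{
    return Foo::s_func();
}

// Hypothetical helper: when only a pointer is at hand and the type is
// awkward to spell, decltype recovers it. The operand of decltype is
// unevaluated, so fooObj being null does not matter.
std::string call_static_via_decltype()
{
    Foo* fooObj = nullptr;
    using Pointee = std::remove_pointer_t<decltype(fooObj)>;
    return Pointee::s_func();
}
```

Neither variant contains a class member access expression on fooObj, so the footnote quoted above does not come into play.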

Ben Voigt
  • Can you cite the almighty Standard? – Jive Dadson Mar 10 '18 at 20:06
  • @JiveDadson: Done. – Ben Voigt Mar 10 '18 at 20:12
  • In the first question link that Lightness provided the top voted answer says "TL;DR: Merely dereferencing a null pointer is not invoking UB. There is a lot of debate over this topic, which basically boils down to whether indirection through a null pointer is itself UB." I'm confused. Maybe I shouldn't have asked the question and life would have remained simpler. – Zebrafish Mar 10 '18 at 20:19
  • @Zebrafish: And in the comments of that answer, you find that Columbo based his answer not on the language rules, but on proposals (that were rejected) – Ben Voigt Mar 10 '18 at 20:21
  • Hmmm, really arcane and abstruse, but thanks. I'll consider it UB. It's funny because I first came across this from a respectable tutorial, anyone can be wrong I guess. – Zebrafish Mar 10 '18 at 20:23
  • @Zebrafish: The Standard itself says (although it is only a note): "[ Note: In particular, a null reference cannot exist in a well-defined program, because the only way to create such a reference would be to bind it to the “object” obtained by indirection through a null pointer, which causes undefined behavior. As described in 12.2.4, a reference cannot be bound directly to a bit-field. — end note ]" – Ben Voigt Mar 10 '18 at 20:24
  • huh, so this isn't just an artifact... I saw bug that was caused by similar situation that if() wasn't working in optimized version. Mostly because some lousy refactoring changed method from static to non-static ( see your comment below, a counter-productive situation) – Swift - Friday Pie Mar 10 '18 at 20:28
  • @Swift: But both the static and non-static cases are UB, if accessed through `->`. Totally possible that the compiler uses the UB in different ways though. – Ben Voigt Mar 10 '18 at 20:30
  • @BenVoigt Yes! But they dutifully changed from :: to -> after removing static.. But misplaced if() that checked if pointer is zero – Swift - Friday Pie Mar 10 '18 at 20:31
  • @Swift: Ahh, yes that would be a problem. – Ben Voigt Mar 10 '18 at 20:33
  • @Zebrafish: _"Maybe I shouldn't have asked the question and life would have remained simpler"_ :) – Lightness Races in Orbit Mar 10 '18 at 20:35
  • @BenVoigt: Unfortunately, we cannot necessarily use that as proof for two reasons: (a) it's non-normative, and (b) the alleged contradictions are reported to be in the form of standard defects in the first place. **But** personally I'm in agreement with you that how can the intent possibly be anything but this! – Lightness Races in Orbit Mar 10 '18 at 20:36
  • @LightnessRacesinOrbit: Got the clear rule addressing the non-static case, added to top of answer. – Ben Voigt Mar 10 '18 at 21:13
  • @Zebrafish: Life got simple, at least for the non-static case you asked about. – Ben Voigt Mar 10 '18 at 21:15
  • @BenVoigt: Continuing to play devil's advocate, because why not, I do not believe that wording is relevant. It tells us only what happens when a member function is invoked _on an object_ belonging to either of two categories (not of type `X`, or not of a type derived from `X`) — it does not tell us what happens when a member function is invoked on anything else (e.g. in this case nothing, or at least not-an-object). If no equivalent wording is found anywhere else for the not-an-object case, then we can at least conclude that there is UB by omission. – Lightness Races in Orbit Mar 10 '18 at 21:25
  • @LightnessRacesinOrbit: With `->` the "there is no object" argument falls flat. It's equivalent to `((*E1).E2)` and unary `*` tells us "the result is an lvalue referring to the object or function to which the expression points" Since `E1` has type `Foo*`, it's not a function pointer, so we can disregard the "or function" branch. The expression **must** point to an object. (And there's the prohibition on applying `*` to a null pointer) – Ben Voigt Mar 10 '18 at 22:36
  • @BenVoigt: Yes, that is much better reasoning for the purposes of forming a proof. Again by omission, the behaviour of `*` is undefined there as there is no "the object or function to which the expression points". But that other wording (bold in the answer) doesn't do the job. – Lightness Races in Orbit Mar 10 '18 at 22:38
  • @LightnessRacesinOrbit: Now, the quote I just gave from the Standard is a defect too, since the object lifetime rules clearly tell us that pointers and references bind to storage, not to objects. But there it is... – Ben Voigt Mar 10 '18 at 22:38
  • @BenVoigt: Full circle! If I remember correctly, that observation formed the basis of the arguments _against_ common sense, or at least the basis of some of them. – Lightness Races in Orbit Mar 10 '18 at 22:39
  • _"And there's the prohibition on applying * to a null pointer"_ Last I checked, there is no such normative thing - for this we must rely on the more general UB per the wording you just quoted - however I am a good four years out of date on standard developments so that may have changed, or I may be misremembering in the first place. Doesn't really matter of course. – Lightness Races in Orbit Mar 10 '18 at 22:40
  • @LightnessRacesinOrbit: Whatever hole might exist doesn't apply to null pointers, though, because they don't identify storage, nor a place where an object may be at other times (but its lifetime hasn't started or has already ended). And these "object will be here at other times" rules specifically forbid access to non-static members during the times when the object doesn't exist. – Ben Voigt Mar 10 '18 at 22:41
  • Anyway, now that we have recreated the entire history of the debate (despite not actually disagreeing!) I can go make dinner. :) – Lightness Races in Orbit Mar 10 '18 at 22:42

In practice this is usually how major compilers implement member functions, yes. This means that your test program would probably appear to run "just fine".

Having said that, dereferencing a null pointer is undefined behavior, which means that all bets are off: the whole program and its output are meaningless, and anything could happen.

You can never rely on this behavior. Optimizers in particular could mess all of this code up, because they're allowed to assume that fooObj is never nullptr.

Hatted Rooster

The compiler isn't obliged by the standard to implement a member function by passing it a pointer to the class instance. Yes, there is the pseudo-pointer "this", but it is an unrelated element that is merely guaranteed to be "understood".

A null pointer doesn't point to any existing object, and -> calls a member of that object. From the standard's point of view this is nonsense, and the result of such an operation is undefined (and potentially catastrophic).

If function() were virtual, the call would be allowed to fail, because the address of the function would be unavailable (the vtable might be implemented as part of the object, and doesn't exist if the object doesn't).

If the member function (method) behaves like that and is meant to be called like that, it should be a static member function (method). A static method doesn't access non-static fields and doesn't call non-static methods of the class. If it is static, the call could look like this as well:

Foo::function(); 
Swift - Friday Pie
  • Access to a static member *could* look like that, but in C++ unlike some other languages, the `.` and `->` operators can also be used with static members (convenient in template code sometimes, when you either don't know if the thing is static, or the type is a pain in the neck to name but you have an instance handy) – Ben Voigt Mar 10 '18 at 20:17
  • @BenVoigt Yeah, "could" fits here better, static member is still a member – Swift - Friday Pie Mar 10 '18 at 20:20