1

For a strictly internal class that is not intended to be used as part of an API provided to an external client, is there anything inherently evil with initializing a class pointer member variable to itself rather than NULL or nullptr?

Please see the below code for an example.

#include <iostream>

class Foo
{
public:
  Foo() :
    m_link(this)
  {
  }

  Foo* getLink()
  {
    return m_link;
  }

  void setLink(Foo& rhs)
  {
    m_link = &rhs;
    // Do other things too.
    // Obviously, the name shouldn't be setLink() if the real code is doing multiple things,
    // but this is a code sample.
  }

  void changeState()
  {
    // This is a code sample, but play along and assume there are actual states to change.
    std::cout << "Changing a state." << std::endl;
  }

private:
  Foo* m_link;
};

void doSomething(Foo& foo)
{
  Foo* link = foo.getLink();

  if (link == &foo)
  {
    std::cout << "A is not linked to anything." << std::endl;
  }

  else
  {
    std::cout << "A is linked to something else. Need to change the state on the link." << std::endl;
    link->changeState();
  }
}

int main(int argc, char** argv)
{
  Foo a;
  doSomething(a);

  std::cout << "-------------------" << std::endl;

  // This is a mere code sample.
  // In the real code, I'm fetching B from a container.
  Foo b;
  a.setLink(b);
  doSomething(a);

  return 0;
}

Output

A is not linked to anything.
-------------------
A is linked to something else. Need to change the state on the link.
Changing a state.

Pros

The benefit to initializing the pointer variable, Foo::link, to itself is to avoid accidental NULL dereferences. Since the pointer can never be NULL, then at worst, the program will produce erroneous output rather than segmentation fault.

Cons

However, the clear downside to this strategy is that it appears to be unconventional. Most programmers are used to checking for NULL, and thus don't expect to check for equality with the object invoking the pointer. As such, this technique would be ill-advised to use in a codebase that is targeted for external consumers, that is, developers expecting to use this codebase as a library.

Final Remarks

Any thoughts from anyone else? Has anyone else said anything substantial on this subject, especially with C++98 in consideration? Note that I compiled this code with a GCC compiler with these flags: -std=c++98 -Wall and did not notice any issues.

P.S. Please feel free to edit this post to improve any terminology I used here.

Edits

  • This question is asked in the spirit of other good practice questions, such as this question about deleting references.
  • A more extensive code example has been provided to clear up confusion. To be specific, the sample is now 63 lines which is an increase from the initial 30 lines. Thus, the variable names have been changed and therefore comments referencing Foo:p should apply to Foo:link.
FriskySaga
  • 409
  • 1
  • 8
  • 19
  • 1
    "at worst, the program will produce erroneous output rather than segmentation fault" is that actually beneficial? Read up on the idea of fail-fast. – TheUndeadFish Jan 02 '22 at 23:31
  • @user17732522 The purpose is to prevent other programmers, including myself, from forgetting to check NULL and subsequently causing a segmentation fault. This microservice absolutely cannot crash. Erroneous output is bad too, but it's better than crashing. – FriskySaga Jan 02 '22 at 23:32
  • 2
    Assuming you don't forget the copy constructor/assign operator, perhaps just by deleting them. I found this which uses this pattern: https://en.wikipedia.org/wiki/Sentinel_node#Second_version_using_a_sentinel_node (although it isn't the best example), and I vaguely remember some data structures that do this. It is similar to the [null object pattern](https://en.wikipedia.org/wiki/Null_object_pattern) (e.g., if it was a singly-linked-list, you can have `size_t length() const { return p == this ? 0 : 1 + p->length() }`, and not have to worry about calling a member function on nullptr) – Artyer Jan 02 '22 at 23:33
  • @user17732522 So let's say `p` is initialized to `NULL` instead. Then every time I need to retrieve `p`, I need to check for `NULL`. If I forget to check for `NULL`, my program crashes. Of course I will have unit tests in place to make sure my program is bug-free, but on the off-chance I miss something, I really don't want my program to crash. – FriskySaga Jan 02 '22 at 23:36
  • If you are not confident about your pointers, and you should be... Why not just `try` and `catch`? – lakeweb Jan 02 '22 at 23:40
  • my 2 cents,, I'd use a [RAII](https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization) architecture and throw an exception in the constructor if something went wrong. If ctor doesn't throw, you can assume everything is good for the lifetime of the object. – yano Jan 02 '22 at 23:42
  • @lakeweb NULL dereferences and anything else inducive of a segmentation fault cannot be caught. See [this](https://stackoverflow.com/questions/1823721/how-to-catch-the-null-pointer-exception) and [this](http://www.cplusplus.com/forum/general/5957/). – FriskySaga Jan 02 '22 at 23:42
  • It is good that it crashes as you then find your defects quickly. If you forgot to check if the a.p == &a and get wrong answers because of that then it is lot harder to track down. – Öö Tiib Jan 02 '22 at 23:43
  • @yano `p` will be changed. The point isn't to have `p` be permanently set to itself. I will edit the post since many are assuming this code sample is the complete code, and that I am really using a public member variable without providing any getters and setters. – FriskySaga Jan 02 '22 at 23:44
  • @ÖöTiib I understand your point, but for production code, we really need this service to be robust. – FriskySaga Jan 02 '22 at 23:45
  • Going to edit the post once I get a chance, but all in all, this question is asked in the spirit of other good practice questions, such as this question about [deleting references](https://stackoverflow.com/questions/3233987/deleting-a-reference). – FriskySaga Jan 02 '22 at 23:47
  • Yes, there is that. But ether way, if you don't have a valid pointer, you don't. If the object is pointing to itself then you will hang if you iterate. How can that be better than a crash? – lakeweb Jan 02 '22 at 23:56
  • Making defects hard to find does not improve robustness and is questionable practice. Objective C has its nil that was designed to make null checks unneeded and IMHO it did cause more confusion than help. – Öö Tiib Jan 02 '22 at 23:56
  • 1
    @Frisky - I guess this depends on your application domain. I used to work for a bank, and there we aborted the transaction on the slightest *suspicion* of an incorrect result. *Absolutely anything* was better than showing the customer an incorrect balance for an account. YMMV, and all that. – BoP Jan 03 '22 at 00:00
  • @BoP Okay that make sense. I see your use case. Thanks for the input. – FriskySaga Jan 03 '22 at 00:18
  • @Artyer Thank you, I will give those articles a thorough read. As for the assignment operator and copy constructor, I was trying to keep the code sample short and trivial. Seems like that confused more people, so I expanded the code sample a bit more since you last commented. That said, I think using the default assignment operator and copy constructor is fine because ultimately I want a shallow copy of the pointers. – FriskySaga Jan 03 '22 at 00:37
  • @TheUndeadFish As you suggested, I read up on fail-fast, and honestly I'm astonished to learn that SpaceX is such a huge champion of approach. That said, I can't help but think that they do need to limit their usage of the concept of fail-fast to unmanned missions, relatively inexpensive prototypes, and non-safety critical systems. Anyways, reading that and reading other responses, especially the response written by Passer By, helped to pry me out of my "never-ever-crash" mindset. My service isn't safety-critical, so I'll take the L and crash. – FriskySaga Jan 03 '22 at 02:28

3 Answers3

2

It's a bad idea to start with, but a horrendous idea as a solution to null dereferences.

You don't hide null dereferences. Ever. Null dereferences are bugs, not errors. When bugs happens, all invariances in your program goes down the toilet and there can be no guarantee for any behaviour. Not allowing a bug to manifest itself immediately doesn't make the program correct in any sense, it only serves to obfuscate and make debugging significantly more difficult.


That aside, a structure pointing into itself is a gnarly can of worms. Consider your copy assignment

Foo& operator=(const Foo& rhs) {
    if(this != &rhs)
        return *this;
    if(rhs->m_link != &rhs)
        m_link = this;
    else
        m_link = rhs->m_link;
}

You now have to check whether you're pointing to yourself every time you copy because its value is possibly tied to its own identity.

As it turns out, there's plenty of cases where such checks are required. How is swap supposed to be implemented?

void swap(Foo& x, Foo& y) noexcept {
    Foo* tx, *ty;
    if(x.m_link == &x)
        tx = &y;
    else
        tx = x.m_link;
    if(y.m_link == &y)
        ty = &x;
    else
        ty = y.m_link;

    x.m_link = ty;
    y.m_link = tx;
}

Suppose Foo has some sort of pointer/reference semantics, then your equality is now also non-trivial

bool operator==(const Foo& rhs) const {
    return m_link == rhs.m_link || (m_link == this && rhs.m_link == &rhs);
}

Don't point into yourself. Just don't.

Passer By
  • 19,325
  • 6
  • 49
  • 96
  • Thanks for diving deep into this answer. I'm glad you brought up the topic of copying, swapping, and equality. In my heart, I knew I wanted shallow copying, but I can see how other programmers might think differently and think they need a deep copy. And certainly if that was the case, this self assignment would be very nasty. But you convinced me when you said: "When bugs happens, all invariances in your program goes down the toilet and there can be no guarantee for any behaviour.". I don't think that's what I want, so I will use the NULL checks. +1 for you. – FriskySaga Jan 03 '22 at 02:05
  • @FriskySaga That's not a problem of shallow or deep copy. An empty `Foo` cannot become non-empty when copied or swapped. – Passer By Jan 03 '22 at 05:14
0

Foo is responsible for its own state. Especially pointers it exposes to its users.

If you expose a pointer in this fashion, as a public member, it is a very odd design decision. My gut has told me the last 30 odd years a pointer like this is not a responsible way to handle Foo's state.

Consider providing getters for this pointer instead.

Foo* getP() {
    // create a safe pointer for user
    // and indicate an error state. (exceptions might be an alternative)
}

Unless you share more context what Foo is, advice is hard to provide.

Captain Giraffe
  • 14,407
  • 6
  • 39
  • 67
  • The pointer is actually a private member, and what is being exposed it to the public interface is the getter. However, I chose to not show the getter in favor of brevity. In fact, my initial code sample did not show the include, and I squeezed the entire initializer list into one line. But I figured I'd improve the readability, and expanded the initializer list to multiple lines. And I added the include in case someone wanted to copy paste my code sample and run it themselves. But thank you for your input and pointing that out. You are correct that it is unusual to not provide a getter. – FriskySaga Jan 02 '22 at 23:39
  • I ran out of characters on the previous comment, but I mean to say the pointer is a private member in my actual code. Of course, here in the code sample, it is a public member. I can edit my post to show it as a private member to clear up any confusion. – FriskySaga Jan 02 '22 at 23:40
  • @FriskySaga then My first sentence applies . – Captain Giraffe Jan 02 '22 at 23:42
0

is there anything inherently evil with initializing a class pointer member variable to itself rather than NULL or nullptr?

No. But as you pointed out, there might be different considerations depending on the use case.

I'm not sure this would be relevant under most circumstances, but there are some instances where an object needs to hold a pointer of its own type, so its really just pertinent to those cases.

For instance, an element in a singly-linked list will have a pointer to the next element, so the last element in the list would normally have a NULL pointer to show there are no further elements. So using this example, the end element could instead point to itself instead of NULL to denote it is the last element. It really just depends on personal implementation preference.

Many times, you can end up obfuscating code needlessly when trying too hard to make it crash-proof. Depending on the situation, you might mask issues and make problems much harder to debug. For instance, going back to the singly-linked example, if the pointer-to-self initialization method is used, and a bug in the program attempts to access the next element from the end element in the list, the list will return the end element again. This would most likely cause the program to continue "traversing" the list for eternity. That might be harder to find/understand than simply letting the program crash and finding the culprit via debugging tools.

Sam
  • 109
  • 5
  • I like this answer a lot, and I really appreciate the effort and time you've put into constructing this answer. For my use case, I will never need to iterate, but now you have me wondering if by any chance I may ever have a chain of method calls that will end up recursively calling each other with no end condition. The actual code is complex and consists of thousands of lines, so I'm going to have to rethink whether it's still worth making crash-proof, but I'll come back to this question in a little while and mark an accepted answer. – FriskySaga Jan 03 '22 at 00:27
  • Also really good idea on your part to cross-reference linked lists. I was racking my brain while responding back to @Passer By, and wondering if it was ever acceptable or commonplace to store a pointer of the same class type as a member variable. But in fact, all linked list implementations do this. – FriskySaga Jan 03 '22 at 02:34