0

I am implementing a modified doubly-linked data structure and while I was unit testing the functions, I encountered a Heisenbug that I can't get rid of.

Specific piece of code that I get the random segmentation fault is below:

     case CURSOR_LIST_COPY_CONSTRUCTOR:
      {
        // Test to construct a cursor list by using the copy constructor.
        if (verbose) {
          os << "\nCURSOR_LIST_COPY_CONSTRUCTOR:" << std::endl <<
            "Starting the construction operation:" <<
            std::endl;
        }
        anil::cursor_list my_cursor_list;
        my_cursor_list.append(1);
        anil::cursor_list my_copied_cursor_list = my_cursor_list;
        if (&my_copied_cursor_list == nullptr &&
            my_copied_cursor_list.index() != -1 &&
            my_copied_cursor_list.size() != 1 &&
            my_copied_cursor_list.front_data() != 1 &&
            my_copied_cursor_list.back_data() != 1) {
          if (verbose) {
            os << "Copy construction unsuccessful!" << std::endl;
          }
          return false;
        } else {
          if (verbose) {
            os << "Copy construction successful!" << std::endl;
          }
          return true;
        }
        return false;
        break;
      }

Two runs out of ten give me a segmentation fault at my_copied_cursor_list.front_data() != 1. Initially I couldn't get to replicate this bug inside gdb but after using the command set disable-randomization off, which I learned from an answer to the question segfault only when NOT using debugger, inside gdb, I was able to replicate the same bug.

The source code for the front data is as following:

int anil::cursor_list::front_data() {
  if (this != nullptr && this->is_empty() == false) {
    return this->front->data;
  }
}

I tried to put return this->front->data; into an if statement in order to test if this->front was somehow nullptr; however, even when I did it, it still somehow caused a segmentation fault. This was what I tried and failed:

int anil::cursor_list::front_data() {
  if (this != nullptr && this->is_empty() == false) {
    if (this->front != nullptr) {
      return this->front->data;
    }
  }
}

The code for the append() function is as following:

void anil::cursor_list::append(int new_data) {
  if (this != nullptr) {
    cursor_list_node* new_node = new cursor_list_node;
    new_node->data = new_data;
    new_node->next = nullptr;
    new_node->previous = this->back;
    if (this->is_empty() == false) {
      this->back->next = new_node;
    } else {
      this->front = new_node;
    }
    this->back = new_node;
    ++this->m_size;
  }
}

The code for the class declaration is as following:

#ifndef ANIL_CURSOR_LIST_H
#define ANIL_CURSOR_LIST_H

#include <cstddef>
#include <iostream>

namespace anil {
  class cursor_list_node {
    private:
      int data;
      cursor_list_node* next;
      cursor_list_node* previous;
      friend class cursor_list; 
  };

  class cursor_list {
    private:

      // Data:
      int m_index;
      int m_size;
      cursor_list_node* front;
      cursor_list_node* back;
      cursor_list_node* cursor;

      // Functions:
      void delete_list();

    public:
      cursor_list() : m_index(-1), m_size(0), front(nullptr), back(nullptr),
        cursor(nullptr) {}
      cursor_list(cursor_list& copied_list);
      bool is_empty();
      int size();
      int index();
      int front_data();
      int back_data();
      int cursor_data();
      bool operator==(cursor_list& rhs); // rhs = right hand side
      cursor_list& operator= (cursor_list& rhs);  // rhs = right hand side
      friend std::ostream& operator<<(std::ostream& out, const cursor_list& rhs); // rhs = right hand side
      void clear();
      void move_cursor_front();
      void move_cursor_back();
      void move_cursor_prev();
      void move_cursor_next();
      void prepend(int new_data);
      void append(int new_data);
      void insert_before_cursor(int new_data);
      void insert_after_cursor(int new_data);
      void delete_front();
      void delete_back();
      void delete_cursor();
      ~cursor_list();
  };
}

#endif /* ANIL_CURSOR_LIST_H */

The segmentation fault message inside gdb is as following:

Program received signal SIGSEGV, Segmentation fault.
0x00000000004018c8 in anil::cursor_list::front_data (this=0x7ffe055100c0)
    at anil_cursor_list.cpp:82
82      return this->front->data;

My operating system is as following:

Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:        16.04
Codename:       xenial

Lastly, the entire source code for the data structure can be found at https://github.com/Karipso/Karipso-Public/blob/master/cpp_practice_codes/data_structures/cursor_list/anil_cursor_list.cpp and https://github.com/Karipso/Karipso-Public/blob/master/cpp_practice_codes/data_structures/cursor_list/anil_cursor_list.h.

Similarly, the source code for the unit tests can be found at: https://github.com/Karipso/Karipso-Public/blob/master/cpp_practice_codes/data_structures/cursor_list/main.cpp

Finally, the makefile is located at https://github.com/Karipso/Karipso-Public/blob/master/cpp_practice_codes/data_structures/cursor_list/Makefile

Can someone help me determine what this bug is caused by?

Update: I tried to run the program using gdb line by line using next and step and printed this and this->front before the seg fault. This is the result:

Breakpoint 1, run_tests (os=..., bst_test=1, verbose=false) at main.cpp:121
121         anil::cursor_list my_copied_cursor_list = my_cursor_list;
(gdb) next
138         if (my_copied_cursor_list.index() != -1) {}
(gdb) 
139         if (my_copied_cursor_list.size() != 1) {}
(gdb) 
140         if (my_copied_cursor_list.front_data() != 1) {}
(gdb) step
anil::cursor_list::front_data (this=0x7ffc683c4500) at anil_cursor_list.cpp:79
79    if (this != nullptr && this->is_empty() == false) {
(gdb) next
80      return this->front->data;
(gdb) print this
$7 = (anil::cursor_list * const) 0x7ffc683c4500
(gdb) print this->front
$8 = (anil::cursor_list_node *) 0xe92cb976bc5a2900
(gdb) next

Program received signal SIGSEGV, Segmentation fault.
0x0000000000401956 in anil::cursor_list::front_data (this=0x7ffc683c4500)
    at anil_cursor_list.cpp:80
80      return this->front->data;

Update-2: I think, I was able to get rid of the Heisenbug. After adding the initializations for all member variables of the cursor_list class, the program stopped giving seg-faults. So, I updated the copy constructor to be as following:

  this->m_index = -1;
  this->m_size = 0;
  this->front = nullptr;
  this->back = nullptr;
  this->cursor = nullptr;
  if (copied_cursor_list.is_empty() == false) {
    for (cursor_list_node* it = copied_cursor_list.front; it != nullptr;
         it = it->next) {
      this->append(it->data);
    }
  }

I will do some experimentation to figure out why append() function wasn't properly initializing the front pointer and update the post if I can figure out the exact cascade of events that lead to that faulty initialization or lack of initialization.

Anil Celik Maral
  • 177
  • 1
  • 12
  • I'd compile the code with debug information (clang: `-g`), turn on all warnings (clang: `-Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-padded -Wno-c99-compat -Wno-poison-system-directories`), and enable the code sanitizers (clang: `-fsanitize=undefined,null`), and possibly debug-friendly optimize (clang: `-Og`). – Eljay Jun 30 '21 at 13:50
  • 1
    Can you include your `append()` function in the question and an appropriate snippet of your class definition, please. That will help make the question self-contained. – Daniel Dearlove Jun 30 '21 at 14:00
  • @DanielDearlove will do! – Anil Celik Maral Jun 30 '21 at 14:01
  • 1
    Do you know how to use gdb? Try `print this->front;` - is it null? You can also `print this;`, `print this->front->data;`, print anything you want really – user253751 Jun 30 '21 at 14:05
  • @user253751 I posted an update that includes printing both `this` and `this->front` before the seg fault. – Anil Celik Maral Jun 30 '21 at 14:35
  • @Eljay I am not sure how and where to include all of those lines in my Makefile. I keep getting errors. Could you please direct me? :) – Anil Celik Maral Jun 30 '21 at 14:41
  • 1
    What are these `this != nullptr` conditions for? They look completely unnecessary. If you are in that member function (`append()`, `front_data()`, etc) then there is an instantiated object. – Daniel Dearlove Jun 30 '21 at 14:46
  • 1
    `&my_copied_cursor_list == nullptr` is never true, and `this != nullptr` is never false. – molbdnilo Jun 30 '21 at 14:46
  • `this->front` is an invalid non-null pointer. – molbdnilo Jun 30 '21 at 14:49
  • @molbdnilo Why would this->front get assigned an invalid non-null pointer? Any insights? – Anil Celik Maral Jun 30 '21 at 14:54
  • 1
    @AnilCelikMaral there must be a lot of compiler warnings. Simplifying your code in the ways already suggested Yakk - Adam Nevraumont should reduce your problems. If you set your compiler to be pedantic, as suggested by Eljay, and you clean them up then you will probably solve a lot of your problems. – Daniel Dearlove Jun 30 '21 at 14:54
  • @DanielDearlove I am trying to but as I mentioned earlier, I am not sure where and how to include them in my Makefile. I keep getting unrecognized command line option errors. – Anil Celik Maral Jun 30 '21 at 14:58
  • 1
    @AnilCelikMaral if you are using a hand-written Makefile, [CXXFLAGS](https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html) is an implicit variable and you can append your flags to it, `CXXFLAGS += -Wall` on its own line for example. That should add the flags to the compiler when you build your application. This is off-topic so you should ask a separate question if this does not work. – Daniel Dearlove Jun 30 '21 at 15:10
  • 1
    @AnilCelikMaral Note, your copy constructor does not initialize any of the variables prior to entering the loop which appends the new elements. A copy constructor does not automatically invoke the default constructor first, and you must still initialize all variables correctly. – Dave S Jun 30 '21 at 15:16
  • 1
    @AnilCelikMaral The cause is not that it gets assigned that value, but that you leave it uninitialized and never assign it a valid value at all. It's just bad luck that it happens to work sometimes. – molbdnilo Jun 30 '21 at 16:42
  • @DaveS Your comments solved my problem, I wish I could accept it as an answer :D – Anil Celik Maral Jul 04 '21 at 12:58

1 Answers1

3
if (this != nullptr)

You aren't allowed to check that. Compilers can and will change this to if(true).

Calling nullptr->blah() is already UB, so within blah the compiler is free to assume nullptr is not null.

int anil::cursor_list::front_data

Not all return paths return a value. Falling off the end is UB.

The compiler is free to remove your if checks guarding the return path, because segfaulting is allowed in UB.

Fixing this might not fix your crash, but with UB in your code, compilers are permitted to generate code that crashes in nonsense ways.

To fix your problem, you should make a much much simpler linked list with fewer operations.

Then unit test that.

Then wrap that simpler linked list in your rich API. And unit test your rich API.

Then when you have a crash in your rich API unit tests, you can write up a minimal unit test in your simple API.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • Damn! You beat me to it. I was about to point out that some code paths do not have a 'return' statement. Next time. ;-) – Daniel Dearlove Jun 30 '21 at 14:51
  • Could you let me know what UB stands for? I will try to simplify the code and do some testing that way. Thanks for pointing out the unnecessary checks. – Anil Celik Maral Jun 30 '21 at 14:55
  • You *are* allowed to check that, but the compiler is allowed to say that it's always true so why did you bother checking it? – user253751 Jun 30 '21 at 15:15
  • 1
    @AnilCelikMaral UB -> [Undefined Behavior](https://en.cppreference.com/w/cpp/language/ub) (the plague of C++ developers). You might expect that a broken code is denied by the compiler or, at least, crashes at the resp. point. Unfortunately, neither nor is granted... – Scheff's Cat Jun 30 '21 at 15:28
  • @user253751 I didn't know any better :/ – Anil Celik Maral Jul 01 '21 at 13:13