3

I have a Token class which is structured as follows:

class Token {
    public:
        void* value;
        token_type type; // enum to know which type of data is it storing inside value
        unsigned int line;

        Token() = default;

        Token(void* new_value, token_type new_type):
            value(new_value), type(new_type)
            {}

        ~Token() {
            //free the memory occupied by the object pointed to by value based on it's type
            //this also handles the case of the Token being an instantiation of Statement
        }
};

Then there's the class Statement:

class Statement: public Token {
    public:
        std::vector<Token*>* list;
        unsigned int lenght = 0;

        Statement() {
            list = new std::vector<Token*>;
            Token((void*) list, statement);
        }
};

Basically, when creating a Statement, the inner Token knows it's holding a statement because there is a specific token_type type for that. The inner Token does the cleanup of the vector in its destructor so it has to have a pointer to that vector in its value attribute, but we also have a copy of that pointer in the Statement so we don't need to do the cast from void* to std::vector<Token>* each time;

Now, what I am trying to do is this:

std::string* value = new std::string("Sample text");
Token* to_be_pushed = new Token((void*) value, string); //the object pointed to by value will be deleted in this Token's destructor

Statement* new_statement = new Statement;

new_statement->list->push_back(to_be_pushed);

delete new_statement; //Token destructor gets called; It knows it's a statement,
//so it knows value is pointing to a std::vector<Token*> object, and it deletes each pointer in that vector and then the vector itself

The problem, however, is I am getting a Segmentation fault error on the line where I push to_be_pushed at the end of new_statement->list.

I've tried breaking that line in two pieces, and I know the problem is when I am calling list->push_back, not when accessing new_statement->list.

Here's the backtrace I got from gdb:

#0  0xb6e51410 in memmove ()
from /system/lib/libc.so
#1  0x2a0066f8 in std::__copy_move<true, true, std::random_access_iterator_tag>::__copy_m<Token*>
(__first=0x2a0198d0, __last=0x0,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_algobase.h:378
#2  0x2a006640 in std::__copy_move_a<true, Token**, Token**> (__first=0x2a0198d0, __last=0x0,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_algobase.h:395
#3  0x2a007070 in std::__copy_move_a2<true, Token**, Token**> (__first=0x2a0198d0, __last=0x0,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_algobase.h:432
#4  0x2a007010 in std::copy<std::move_iterator<Token**>, Token**> (__first=..., __last=...,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_algobase.h:464
#5  0x2a006fb0 in std::__uninitialized_copy<true>::__uninit_copy<std::move_iterator<Token**>, Token**> (__first=..., __last=...,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_uninitialized.h:93
#6  0x2a006f70 in std::uninitialized_copy<std::move_iterator<Token**>, Token**> (__first=...,
__last=..., __result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_uninitialized.h:123
#7  0x2a006eec in std::__uninitialized_copy_a<std::move_iterator<Token**>, Token**, Token*> (
__first=..., __last=...,
__result=0x2a019908)
at /data/data/com.termux/files/usr/include/bits/stl_uninitialized.h:279
#8  0x2a006dc0 in std::__uninitialized_move_if_noexcept_a<Token**, Token**, std::allocator<Token*> > (__first=0x2a0198d0, __last=0x0,
__result=0x2a019908, __alloc=...)
at /data/data/com.termux/files/usr/include/bits/stl_uninitialized.h:300
#9  0x2a007264 in std::vector<Token*, std::allocator<Token*> >::_M_emplace_back_aux<Token* const&> (this=0x2a019908,
__args=@0x2a0198e8: 0x2a0198d0)
at /data/data/com.termux/files/usr/include/bits/vector.tcc:457
#10 0x2a005b70 in std::vector<Token*, std::allocator<Token*> >::push_back (this=0x2a019908,
__x=@0x2a0198e8: 0x2a0198d0)
at /data/data/com.termux/files/usr/include/bits/stl_vector.h:1049 

Why is this happening? What am I doing wrong? Is it the code I've posted fault?

Dan Mašek
  • 17,852
  • 6
  • 57
  • 85
user6245072
  • 2,051
  • 21
  • 34
  • `Token((void*) list, statement);` You seem to think that calls the base class constructor. It does not: the base class is initialized with the default constructor. Instead, it creates a temporary unnamed instance of `Token`, which is immediately destroyed. In other words, it's a no-op. This is further exacerbated by the fact that `Token()` default constructor leaves its members uninitialized, containing random garbage. – Igor Tandetnik Oct 09 '16 at 16:55
  • @Igor Tandetnik how could I achieve the right behaviour, i.e. storing in the Token's `value` and in the Statement's `list` the same pointer to a new `std::vector` object? The pointer is however already being stored the right way in `list`, so my bug is independent on that. Thanks for pointing it out. – user6245072 Oct 09 '16 at 16:59
  • You haven't shown `~Token()` destructor, but presumably it tries to access its member variables, which are, again, uninitialized and containing random garbage; whereupon your program exhibits undefined behavior. I'm pretty sure the problem I've pointed out is in fact directly relevant to the behavior you observe. – Igor Tandetnik Oct 09 '16 at 17:02
  • @Igor Tandetnik ok then, fixing that and seeing what happens. – user6245072 Oct 09 '16 at 17:03

1 Answers1

4
Token((void*) list, statement);

You are expecting this to call the superclass's constructor. This does not call the constructor. All this does is construct a temporary object, which then gets immediately destroyed. The only way to call the superclass's constructor is in the initialization section of the subclass:

 Statement() : Token(...)

However in your case, you need to initialize the subclass, namely its list member, before invoking the constructor for the superclass. This cannot be easily done. Although there are ways around this, this is really a symptom of a fundamental design flaw with this class hierarchy.

You have two options:

  1. Reimplement your class hierarchy. The way your classes are designed are fundamentally wrong. Properly-designed C++ code should never need to do tricks like casting to void *.

  2. Initialize Token manually, in the body of the Statement's constructor. Use Token's default constructor, then fix it up in the Statement's constructor body.

However, even if you try the #2 approach, you will likely find other problems later down the road, specifically: your Statement class violates the Rule Of Three. That's almost a guaranteed certainty to cause a number of difficult to track down bugs.

The right answer here will be to step back, and redesign your class hierarchy completely. And getting rid of new allocation in favor of proper usage of C++ library containers would also be a plus.

There are many ways to redesign this class hierarchy, and without additional information it is not possible to suggest the right class design.

Community
  • 1
  • 1
Sam Varshavchik
  • 114,536
  • 5
  • 94
  • 148
  • This fixed it. Thanks for your answer. I am just a beginner with proper OOP (i come from Lua) so dirty code is normal I hope. Can you show me some links on how to get rid of `void*` please? They keep telling me it's wrong but I can't find online a better way to do what `void*` does for me perfectly: storing a pointer to an object of unknown type. – user6245072 Oct 09 '16 at 17:12
  • "The inner Token does the cleanup of the vector in its destructor" This would be just one of the problems. `Statement` constructed it, so `Statement` should destroy it. That's proper class design. Furthermore, there's nothing "unknown" about it. Its type is always the same. – Sam Varshavchik Oct 09 '16 at 17:14
  • If I think about it, Statement doesn't need to be a subclass of Token. However yes, the type of `value` is unknown in advance because it could be storing a `std::string` and thus having a `type` attribute containing `token_type::string`, or it could contain a double whith `token_type::number` and so on. – user6245072 Oct 09 '16 at 17:18
  • 1
    ... and the right solution would be to define a base class, and derive each possible value of the metadata used by `token`, and store a `std::shared_ptr` to the base class. Again, there's no need for a `void *` here. – Sam Varshavchik Oct 09 '16 at 17:19
  • Can you elaborate on that last bit some more please? Again, just a link would be helpful. – user6245072 Oct 09 '16 at 17:23
  • This cannot exactly be explained in a few paragraphs, [find a good C++ book](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list). – Sam Varshavchik Oct 09 '16 at 17:27
  • Did you mean having a single class for each of the subtypes of Token (like, one for `token_type::string` storing a `std::string` and bla bla bla)? I don't see how this solves the problem, as I will anyway need to store them as a Token* somewhere. – user6245072 Oct 09 '16 at 17:29