
My problem comes from a project that I'm supposed to finish. I have to create an std::unordered_map<T, unsigned int> where T is a pointer to a base, polymorphic class. After a while, I figured it would also be good practice to use an std::unique_ptr<T> as a key, since my map is meant to own the objects. Let me introduce some backstory:

Consider a class hierarchy with polymorphic sell_obj as a base class, and book and table inheriting from it. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>. Therefore, erasing a pair from that map will automatically free the memory pointed to by the key. The whole idea is to have keys pointing to books/tables, and the value of each key will represent the amount of that product that our shop contains.

As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this:

namespace std{
    template <> struct hash<book>{
        size_t operator()(const book& b) const
        {
            return 1; // simplified
        }
    };

    template <> struct hash<table>{
        size_t operator()(const table& b) const
        {
            return 2; // simplified
        }
    };
    // The standard provides a specialization so that std::hash<unique_ptr<T>> is the same as std::hash<T*>.
    template <> struct hash<sell_obj*>{
        size_t operator()(const sell_obj *s) const
        {
            const book *b_p = dynamic_cast<const book*>(s);
            if(b_p != nullptr) return std::hash<book>()(*b_p);
            else{
                const table *t_p = static_cast<const table*>(s);
                return std::hash<table>()(*t_p);
            }
        }
    };
} 

Now let's look at implementation of the map. We have a class called Shop which looks like this:

#include "sell_obj.h"
#include "book.h"
#include "table.h"

#include <unordered_map>
#include <memory>

class Shop
{
    public:
        Shop();

        void add_sell_obj(sell_obj&);
        void remove_sell_obj(sell_obj&);

    private:
        std::unordered_map<std::unique_ptr<sell_obj>, unsigned int> storeroom;

};

and the implementation of two crucial functions:

void Shop::add_sell_obj(sell_obj& s_o)
{
    std::unique_ptr<sell_obj> n_ptr(&s_o);
    storeroom[std::move(n_ptr)]++;
}

void Shop::remove_sell_obj(sell_obj& s_o)
{
    std::unique_ptr<sell_obj> n_ptr(&s_o);
    auto target = storeroom.find(std::move(n_ptr));
    if(target != storeroom.end() && target->second > 0) target->second--;
}

in my main I try to run the following code:

int main()
{

    book *b1 = new book("foo", "bar", 10);
    sell_obj *ptr = b1;

    Shop S_H;
    S_H.add_sell_obj(*ptr); // works fine I guess

    S_H.remove_sell_obj(*ptr); // usually (not always) crashes [SIGSEGV]

    return 0;
}

my question is - where does my logic fail? I heard that it's fine to use std::unique_ptr in STL containers since C++11. What's causing the crash? The debugger does not provide any information besides the crash occurrence.

If more information about the project is needed, please point it out. Thank you for reading.

Fureeish
  • The whole point of `unique_ptr` is that they have unique ownership. You can't have two of them owning the same thing. Making one from `&s_o` was sketchy to begin with and now it's bitten you. – chris Apr 12 '17 at 12:45
  • Where did I suggest that I want to have 2 `unique_ptr`s owning the same object? I can see why it *could* be seen as sketchy, but how else am I supposed to achieve a similar effect? – Fureeish Apr 12 '17 at 12:46
  • 1
    You have two lines there both saying "I'm going to own `s_o` now." Naturally, when `s_o` is the same object both times, or if one is called on the same object twice, they aren't very unique. Considering your functions have no way of telling whether it will be unique, you should rethink the API. – chris Apr 12 '17 at 12:49
  • Sounds fair. Any idea how I could add/remove the objects then? – Fureeish Apr 12 '17 at 12:51
  • 1
    Consider using `shared_ptr` instead of `unique_ptr`. – 1201ProgramAlarm Apr 12 '17 at 12:51
  • 1
    And implement `hash` as a virtual function in `sell_obj` (overridden in your derived classes) to do away with that `dynamic_cast`. – 1201ProgramAlarm Apr 12 '17 at 12:52
  • Okay, will try to replace `unique_ptr` with shared one and re-do the hashes as virtual functions. Thank you for help. If it still won't work, I will edit the question – Fureeish Apr 12 '17 at 12:54
  • @1201ProgramAlarm, unfortunately, replacing `unique_ptr` with `shared_ptr` has resulted in no changes to program's behaviour. Additionally, pardon my lack of knowledge, but I have no idea how to implement hash functions as you said. Are they just `operator()`? – Fureeish Apr 12 '17 at 13:04
  • Avoid `new`, and use `make_unique` directly (and so `add_sell_obj(std::unique_ptr)`). The compiler would then spot your error. And the problem would also occur in the interface with `remove_sell_obj(std::unique_ptr)`... – Jarod42 Apr 12 '17 at 13:24
  • As others pointed out already, the main issue is that you have two unique_ptrs to the same object created by `book *b1 = new Book(...);`: One created by the call to `add_sell_obj` and one created by the call to `remove_sell_obj`. Both of these will eventually call `delete` on the object pointed to by `b1`, so you have a double-free and hence a crash. Using a `shared_ptr` doesn't help with the way this is structured currently, because you'll create two `shared_ptr`s *with their own reference count* instead of two `shared_ptr`s that share a reference count. So it will be deleted twice again. – Corristo Apr 12 '17 at 13:28
  • @Jarod42, `std::make_unique` is not a part of C++11, which I am using. The rest of your comment is pointless since I was not using what you suggested as bad. – Fureeish Apr 12 '17 at 13:31
  • @Corristo I understand what you are saying and now I can see the problem. Do you have any idea how to implement this properly then? – Fureeish Apr 12 '17 at 13:32
  • In order to fix this there are several possibilities, but it depends on your use-case. Currently a user of the shop already needs a pointer to the object stored in it. How is the user supposed to get that pointer? – Corristo Apr 12 '17 at 13:33
  • EDIT (I misunderstood your question) - the main idea is to have a pointer in main which will be used to both add and remove objects. In order to remove an object, we change the pointer in main however we like (we change members) and pass it to `remove_sell_obj()` – Fureeish Apr 12 '17 at 13:36
  • I'm having trouble understanding what this `Shop` class is supposed to do. Why do you want to add pointers to the shop multiple times? When should a client call `remove_sell_obj`? – Corristo Apr 12 '17 at 13:42
  • If you change the object through the pointer in main, you'll also modify the object already in the shop. Is that what you want? – Corristo Apr 12 '17 at 13:43
  • My bad for explaining like an idiot. I just read my comment again and facepalmed. My idea is this: We have a storeroom in a shop with unique_ptr's as keys and their values as amounts. We now want the user to specify a new object and either call add or remove (with that object as parameter). The pointers in the storeroom should not point to that object. Every time a new object is added (the size of the map increases), a pointer should create an identical, new object that it will point to – Fureeish Apr 12 '17 at 13:50
  • Ok, then the map doesn't seem to make sense, because you'll never have a single `std::unique_ptr` mapped to a value > 1. What I don't get is why you don't remove the pointer entirely from the map but just set its value to 0. If you could remove the pointer when you "sell" the object instead of keeping a count of 0, you probably want to use a std::unordered_set instead. – Corristo Apr 12 '17 at 13:56
  • Suppose we have a `unique_ptr` that points to an object of type **book** with name = "A", author = "A" and price = 10 in this map. Value of the key is 1. If a user wants to add the *exact same* **book**, the value should increase. Does it make sense? EDIT: By "every time a new object is added" I meant "every time a new object is added and is different from the objects that are pointed to by `unique_ptr`s in the map" – Fureeish Apr 12 '17 at 13:58
  • That conflicts with what you said before, because now if you modify the second book you added, you also will modify the first one you inserted. I think the main question is: Do you want to treat all books with the same name, author and price as the *identical* book, so that changing one of them changes all others that previously had the same author, name and price? – Corristo Apr 12 '17 at 14:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/141560/discussion-between-corristo-and-fureeish). – Corristo Apr 12 '17 at 14:06
  • @Fureeish: Think about ownership: as stated, the `unordered_map` would own the `obj_sell` as **key**, so you won't have a `std::unique_ptr` to provide to remove them. A `std::unordered_map<Key, std::unique_ptr<sell_obj>>` would make more sense; it still would own the object, but as value, so you can have the key to allow removal. – Jarod42 Apr 12 '17 at 14:29
  • 1
    @Fureeish Based on the discussion today, [this](https://wandbox.org/permlink/kkVIN825NV8gagaY) is what I came up with. Maybe you can flesh it out to a full implementation of what you need. If you have any questions regarding any of this stuff, feel free to ask. – Corristo Apr 12 '17 at 23:21
  • the code makes sense to me. I'll just need to read more about lambdas. Thank you very much for help! – Fureeish Apr 13 '17 at 12:02

1 Answer


There are quite a few problems with the logic in the question. First of all:

Consider class hierarchy with polymorphic sell_obj as base class. book and table inheriting from that class. We now know that we need to create a std::unordered_map<std::unique_ptr<sell_obj*>, unsigned int>.

In such cases std::unique_ptr<sell_obj*> is not what we would want. We would want std::unique_ptr<sell_obj>. Without the *. std::unique_ptr is already "a pointer".

As we are dealing with std::unordered_map, we should specify hashes for all three classes. To simplify things, I specified them in main like this: [...]

This is also an undesirable approach. It would require changing that part of the code every time we add another subclass to the hierarchy. It would be best to delegate the hashing (and comparing) polymorphically to avoid such problems, exactly as @1201ProgramAlarm suggested.

[...] implementation of two, crucial functions:

void Shop::add_sell_obj(sell_obj& s_o)
{
    std::unique_ptr<sell_obj> n_ptr(&s_o);
    storeroom[std::move(n_ptr)]++;
}

void Shop::remove_sell_obj(sell_obj& s_o)
{
    std::unique_ptr<sell_obj> n_ptr(&s_o);
    auto target = storeroom.find(std::move(n_ptr));
    if(target != storeroom.end() && target->second > 0) target->second--;
}

This is wrong for a couple of reasons. First of all, taking an argument by non-const reference suggests modification of the object. Second of all, creating n_ptr from a pointer obtained by applying & to an argument is incredibly risky. It assumes that the object is allocated on the heap and is unowned - a situation that generally should not take place and is incredibly dangerous. In case the passed object is on the stack and / or is already managed by some other owner, this is a recipe for disaster (like a segfault).

What's more, it is more or less guaranteed to end in disaster, since both add_sell_obj() and remove_sell_obj() create std::unique_ptrs to potentially the same object. This is exactly the case in the original question's main(). Two std::unique_ptrs pointing to the same object result in a double delete.


While this is not necessarily the best approach for the problem in C++ (as compared to Java), there are a couple of interesting tools that can be used for the task. The code below assumes C++20.

The class hierarchy

First of all, we need a base class that will be used when referring to all the objects stored in the shop:

struct sell_object { };

And then we need to introduce classes that will represent concrete objects:

class book : public sell_object {
    std::string title;

public:
    book(std::string title) : title(std::move(title)) { }
};

class table : public sell_object {
    int number_of_legs = 0;

public:
    table(int number_of_legs) : number_of_legs(number_of_legs) { }
};

For simplicity (but to still have some distinction) I chose for them to have just one distinct field each (title and number_of_legs).

The storage

The shop class that will represent storage for any sell_object needs to somehow store, well, any sell_object. For that we either need to use pointers or references to the base class. You can't have a container of references, so it's best to use pointers. Smart pointers.

Originally the question suggested the usage of std::unordered_map. Let us stick with it:

class shop {

    std::unordered_map<
            std::unique_ptr<sell_object>, int
    > storage;

public:
    auto add(...) -> void {
        ...
    }

    auto remove(...) -> void {
        ...
    }
};

It is worth mentioning that we chose std::unique_ptr as the key for our map. That means the storage is going to copy the passed objects and use the copies it owns to compare with the elements we query (add or remove). No more than one equal object will be copied, though.

The fixed version of storage

There is a problem, however. std::unordered_map uses hashing and we need to provide a hash strategy for std::unique_ptr<sell_object>. Well, there already is one and it uses the hash strategy for T*. The problem is that we want to have custom hashing. Those particular std::unique_ptr<sell_object>s should be hashed according to the associated sell_objects.

Because of this, I opted for a different approach than the one proposed in the question. Instead of providing a global specialization in the std namespace, I will use a custom hashing object and a custom comparator:

class shop {

    struct sell_object_hash {
        auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
            return object->hash();
        }
    };

    struct sell_object_equal {
        auto operator()(
                std::unique_ptr<sell_object> const& lhs,
                std::unique_ptr<sell_object> const& rhs
        ) const -> bool {
            return (*lhs <=> *rhs) == 0;
        }
    };

    std::unordered_map<
            std::unique_ptr<sell_object>, int,
            sell_object_hash, sell_object_equal
    > storage;

public:
    auto add(...) -> void {
        ...
    }

    auto remove(...) -> void {
        ...
    }
};

Notice a few things. First of all, the type of storage has changed. It is no longer an std::unordered_map<std::unique_ptr<T>, int>, but an std::unordered_map<std::unique_ptr<T>, int, sell_object_hash, sell_object_equal>. This indicates that we are using a custom hasher (sell_object_hash) and a custom comparator (sell_object_equal).

The lines that deserve extra attention are:

  • return object->hash();
  • return (*lhs <=> *rhs) == 0;

Onto them:

return object->hash();

This is a delegation of hashing. Instead of having an external "observer" type that implements different hashing for each and every possible type derived from sell_object, we require that those objects supply the hashing themselves. In the original question, the std::hash specialization was the said "observer". It certainly did not scale as a solution.

In order to achieve the aforementioned, we modify the base class to impose the listed requirement:

struct sell_object {
    virtual auto hash() const -> std::size_t = 0;
};

Thus we also need to change our book and table classes:

class book : public sell_object {
    std::string title;

public:
    book(std::string title) : title(std::move(title)) { }

    auto hash() const -> std::size_t override {
        return std::hash<std::string>()(title);
    }
};

class table : public sell_object {
    int number_of_legs = 0;

public:
    table(int number_of_legs) : number_of_legs(number_of_legs) { }

    auto hash() const -> std::size_t override {
        return std::hash<int>()(number_of_legs);
    }
};

return (*lhs <=> *rhs) == 0;

This is a C++20 feature called the three-way comparison operator, sometimes called the spaceship operator. I opted to use it, since starting with C++20, most types that want to be comparable will use this operator. That means we also need our concrete classes to implement it. What's more, we need to be able to call it through base references (sell_object&). Yet another virtual function (an operator, actually) needs to be added to the base class:

struct sell_object {
    virtual auto hash() const -> std::size_t = 0;

    virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;
};

Every subclass of sell_object is going to be required to be comparable with other sell_objects. The main reason is that we need to compare sell_objects in our storage map. I used std::partial_ordering, since we require every sell_object to be comparable with every other sell_object. While comparing two books or two tables yields a strong ordering (a total ordering where two equivalent objects are indistinguishable), we also - by design - need to support comparing a book to a table. Such a comparison is somewhat meaningless (no ordering exists between the two). Fortunately, C++20 helps us here with std::partial_ordering::unordered: such elements are not equal and neither of them is greater or less than the other. Perfect for this scenario.

Our concrete classes need to change accordingly:

class book : public sell_object {
    std::string title;

public:
    book(std::string title) : title(std::move(title)) { }

    auto hash() const -> std::size_t override {
        return std::hash<std::string>()(title);
    }

    auto operator<=>(book const& other) const {
        return title <=> other.title;
    };

    auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
        if (auto book_ptr = dynamic_cast<book const*>(&other)) {
            return *this <=> *book_ptr;
        } else {
            return std::partial_ordering::unordered;
        }
    }
};

class table : public sell_object {
    int number_of_legs = 0;

public:
    table(int number_of_legs) : number_of_legs(number_of_legs) { }

    auto hash() const -> std::size_t override {
        return std::hash<int>()(number_of_legs);
    }

    auto operator<=>(table const& other) const {
        return number_of_legs <=> other.number_of_legs;
    };

    auto operator<=>(sell_object const& other) const -> std::partial_ordering override {
        if (auto table_ptr = dynamic_cast<table const*>(&other)) {
            return *this <=> *table_ptr;
        } else {
            return std::partial_ordering::unordered;
        }
    }
};

The overridden operator<=>s are required by the base class. They are quite simple - if the other object (the one we are comparing this object to) is of the same type, we delegate to the <=> version that uses the concrete type. If not, we have a type mismatch and report the unordered ordering.

For those of you who are curious why the <=> implementation that compares two identical types is not `= default`ed: a defaulted implementation would compare the base-class subobject first, which would delegate to the sell_object version. That would dynamic_cast again and delegate back to the defaulted implementation, which would compare the base class and... result in infinite recursion.

add() and remove() implementation

Everything seems great, so we can move on to adding and removing items to and from our shop. However, we immediately arrive at a hard design decision. What arguments should add() and remove() accept?

  • std::unique_ptr<sell_object>? That would make their implementation trivial, but it would require the user to construct a potentially useless, dynamically allocated object just to call a function.

  • sell_object const&? That seems correct, but there are two problems with it: 1) we would still need to construct an std::unique_ptr with a copy of passed argument to find the appropriate element to remove; 2) we wouldn't be able to correctly implement add(), since we need the concrete type to construct an actual std::unique_ptr to put into our map.

Let us go with the second option and fix the first problem. We certainly do not want to construct a useless and expensive object just to look for it in the storage map. Ideally we would like to find a key (std::unique_ptr<sell_object>) that matches the passed object. Fortunately, transparent hashers and comparators come to the rescue.

By supplying additional overloads for hasher and comparator (and providing a public is_transparent alias), we allow for looking for a key that is equivalent, without needing the types to match:

struct sell_object_hash {
    auto operator()(std::unique_ptr<sell_object> const& object) const -> std::size_t {
        return object->hash();
    }

    auto operator()(sell_object const& object) const -> std::size_t {
        return object.hash();
    }

    using is_transparent = void;
};

struct sell_object_equal {
    auto operator()(
            std::unique_ptr<sell_object> const& lhs,
            std::unique_ptr<sell_object> const& rhs
    ) const -> bool {
        return (*lhs <=> *rhs) == 0;
    }

    auto operator()(
            sell_object const& lhs,
            std::unique_ptr<sell_object> const& rhs
    ) const -> bool {
        return (lhs <=> *rhs) == 0;
    }

    auto operator()(
            std::unique_ptr<sell_object> const& lhs,
            sell_object const& rhs
    ) const -> bool {
        return (*lhs <=> rhs) == 0;
    }

    using is_transparent = void;
};

Thanks to that, we can now implement shop::remove() like so:

auto remove(sell_object const& to_remove) -> void {
    if (auto it = storage.find(to_remove); it != storage.end()) {
        it->second--;
        if (it->second == 0) {
            storage.erase(it);
        }
    }
}

Since our comparator and hasher are transparent, we can find() an element that is equivalent to the argument. If we find it, we decrement the corresponding count. If it reaches 0, we remove the entry completely.

Great, onto the second problem. Let us list the requirements for the shop::add():

  • we need the concrete type of the object (merely a reference to the base class is not enough, since we need to create a matching std::unique_ptr).
  • we need that type to be derived from sell_object.

We can achieve both with a constrained template:

template <std::derived_from<sell_object> T>
auto add(T const& to_add) -> void {
    if (auto it = storage.find(to_add); it != storage.end()) {
        it->second++;
    } else {
        storage[std::make_unique<T>(to_add)] = 1;
    }
}

This is, again, quite simple.


Correct destruction semantics

There is only one more thing that separates us from a correct implementation: if we delete an object through a pointer (smart or not) to its base class, the base class destructor needs to be virtual.

This leads us to the final version of the sell_object class:

struct sell_object {
    virtual auto hash() const -> std::size_t = 0;

    virtual auto operator<=>(sell_object const&) const -> std::partial_ordering = 0;

    virtual ~sell_object() = default;
};

See full implementation with example and additional printing utilities.
