Raw pointer lookup for sets of unique_ptrs

Question

I often find myself wanting to write code like this:

class MyClass
{
public:
  void addObject(std::unique_ptr<Object>&& newObject);

  void removeObject(const Object* target);

private:
  std::set<std::unique_ptr<Object>> objects;
};

However, much of the std::set interface is kind of useless with std::unique_ptrs since the lookup functions require std::unique_ptr parameters (which I obviously don't have because they're owned by the set itself).

I can think of two main solutions to this.

Create a temporary unique_ptr for lookup. For example, the above removeObject() could be implemented like:

void MyClass::removeObject(const Object* target)
{
  std::unique_ptr<Object> targetSmartPtr(target);
  objects.erase(targetSmartPtr);
  targetSmartPtr.release();
}

Replace the set with a map of raw pointers to unique_ptrs.

  // ...
  std::map<const Object*, std::unique_ptr<Object>> objects;
};

However, both seem slightly stupid to me. In solution 1, erase() isn't noexcept, so the temporary unique_ptr might delete the object it doesn't really own, and 2 requires double the storage for the container unnecessarily.

I know about Boost's pointer containers, but their current features are limited compared to modern C++11 standard library containers.

I was recently reading about C++14 and came across "Adding heterogeneous comparison lookup to associative containers". But form my understanding of it, the lookup types must be comparable to the key types, but raw pointers aren't comparable to unique_ptrs.

Anyone know of a more elegant solution or an upcoming addition to C++ that solves this problem?

Interesting, this looks like a design oversight in the container classes – there should be an easy way to do this. — Konrad Rudolph, Sep 22 '13 at 13:59
Agreed. Maybe there should be something like `std::ptr_less`, `std::ptr_equal_to`, `std::ptr_hash`, etc similar to Yakk's pointer_comp to simplify pointer comparisons/lookups. They'd go along nicely with C++1y's heterogeneous comparison lookup to associative containers. — jbatez, Sep 22 '13 at 16:03
Off topic: You shouldn't be taking the unique pointer by rvalue reference; just by value suffices. — Kerrek SB, Dec 01 '13 at 21:39
related: [Using custom std::set comparator](https://stackoverflow.com/questions/2620862/using-custom-stdset-comparator). — Ciro Santilli OurBigBook.com, Jun 05 '19 at 09:28

score 39 · Accepted Answer · edited Jun 08 '17 at 12:10

39

In C++14, std::set<Key>::find is a template function if Compare::is_transparent exists. The type you pass in does not need to be Key, just equivalent under your comparator.

So write a comparator:

template<class T>
struct pointer_comp {
  typedef std::true_type is_transparent;
  // helper does some magic in order to reduce the number of
  // pairs of types we need to know how to compare: it turns
  // everything into a pointer, and then uses `std::less<T*>`
  // to do the comparison:
  struct helper {
    T* ptr;
    helper():ptr(nullptr) {}
    helper(helper const&) = default;
    helper(T* p):ptr(p) {}
    template<class U, class...Ts>
    helper( std::shared_ptr<U,Ts...> const& sp ):ptr(sp.get()) {}
    template<class U, class...Ts>
    helper( std::unique_ptr<U, Ts...> const& up ):ptr(up.get()) {}
    // && optional: enforces rvalue use only
    bool operator<( helper o ) const {
      return std::less<T*>()( ptr, o.ptr );
    }
  };
  // without helper, we would need 2^n different overloads, where
  // n is the number of types we want to support (so, 8 with
  // raw pointers, unique pointers, and shared pointers).  That
  // seems silly:
  // && helps enforce rvalue use only
  bool operator()( helper const&& lhs, helper const&& rhs ) const {
    return lhs < rhs;
  }
};

then use it:

typedef std::set< std::unique_ptr<Foo>, pointer_comp<Foo> > owning_foo_set;

now, owning_foo_set::find will accept unique_ptr<Foo> or Foo* or shared_ptr<Foo> (or any derived class of Foo) and find the correct element.

Outside of C++14, you are forced to use the map to unique_ptr approach, or something equivalent, as the signature of find is overly restrictive. Or write your own set equivalent.

edited Jun 08 '17 at 12:10

Jean-Michaël Celerier

7,412
3
54
75

answered Sep 22 '13 at 05:18

Yakk - Adam Nevraumont

262,606
27
330
524

You can solve it in C++98 by passing a custom comparator to `std::lower_bound()`. In any case, interesting to see what's coming up in C++14. – Ali Sep 22 '13 at 11:00
1

@ali `std::lower_bound` is linear on `set` iterators. No random access. – Yakk - Adam Nevraumont Sep 22 '13 at 13:41
2

+1. Apparently, there is no way to do this with standard C++11 containers without significant compromises. Either something will run in linear time in the worst case or there are those workarounds mentioned in the OP. – Ali Sep 22 '13 at 15:30
Very cool. I know I didn't mention it in my question, but what about `unordered_set`? Will C++14 allow something similar if both `Hash::is_transparent` and `KeyEqual::is_transparent` exist? As a side note: that helper struct solves another persistent problem I've had with my C++ designs: cartesian products of parameter types. – jbatez Sep 22 '13 at 15:54
@jobates cpprrference implies that there is a similar `find` overload, but it is just a copy paste of the `set` clause. I would have to track down the proposal. – Yakk - Adam Nevraumont Sep 22 '13 at 17:33
@Yakk I saw that too but I can't find it anywhere in the draft standard or proposal. If it's not there, can you think of any reason why it shouldn't be? I might be in the mood to push for it. – jbatez Sep 26 '13 at 04:40
1

@JoBates A `transparent` hash function might be a tiny bit trickier than a `transparent` comparitor, but in some cases useful: A `std::string` hash that works with `const char*` buffers or `std::string_view` could save allocations when doing a lookup. And the "only part of the key is actually the key" thing also applies. – Yakk - Adam Nevraumont Sep 26 '13 at 09:50
Neat. You should add the full set of template parameters for the unique pointer, too, for maximal flexibility. – Kerrek SB Dec 01 '13 at 21:41
@kerreksb good idea. I could even use a `to_pointer` free function with `unique_ptr` and `shared_ptr` overloads, to allow ADL extension, if this was a serious library. – Yakk - Adam Nevraumont Dec 02 '13 at 03:19
1

Great answer! Still trying to fully grasp C++11, but it seems there's already a compelling reason for transitioning to C++14. What's odd, is that unique_ptr + stl container is hailed everywhere as something great; didn't find any mention of this problem, which I perceive as a major shortcoming. – pauluss86 Feb 09 '14 at 17:06
@pauluss86 In 80%+ of situations, the container you actually want is `std::vector`, even when you think you want something different. In the other 20%, 80%+ of them you want an `unordered_set` or `unordered_map`. In the remaining 4%, you probably want a container that isn't in the `std::` namespace. `std::map`, `std::set`, `std::list` and `std::deque` are corner-case containers in my experience. – Yakk - Adam Nevraumont Feb 09 '14 at 17:22
1

@Yakk, I actually wanted an `unordered_set` of `unique_ptr`s, because I need fast lookup (not by index) and the container changes frequently. Currently I use a `unordered_map` because it allows erasing by key. The real 'problem' here is: what are proper arguments for a `remove(xxx)` method that changes some private container containing `unique_ptr`s? – pauluss86 Feb 09 '14 at 18:28
@pauluss86 There is no way under C++03, 11 or 1y (as far as I know?) to `move` data out of an `unordered_set` or `set` (or otherwise mutate-while-removing), so `unique_ptr` is well not suited for storage in an `unordered_set`. I've seen some proposals or people talking about proposals to fix this. – Yakk - Adam Nevraumont Feb 10 '14 at 01:42
1

@Yakk: true; my intention was (lets say a `Register` for discussion purposes) that acts as a 'sink', by taking full ownership of pointers-to-instances passed into it. My problem was: how can a caller (after his initial pointer goes out of scope) 'unregister' an item later on? If `Register` uses a set of `unique_ptr`s, then the callers must have a `unique_ptr` to pass to the `Register` to erase it from the set. Obviously, this undermines the whole scheme. For that reason I went with a `unordered_map`: callers can remove items by key. – pauluss86 Feb 10 '14 at 07:52
I am convolutely coming to the same conclusion for the answer to [my similar question](http://stackoverflow.com/questions/30132951/create-multiple-indexes-into-a-large-collection-of-objects-with-smart-pointers). Thanks @pauluss86. Now... use map and increase my key storage space, or wait for C++14... – moodboom May 09 '15 at 22:33

score 4 · Answer 2 · answered Sep 09 '18 at 12:34

Another possibility, close to the accepted answer, but a little different and simplified.

We can exploit the fact that standard comparator std::less<> (with no template arguments) is transparent. Then, we can supply our own comparison functions in the global namespace:

// These two are enough to be able to call objects.find(raw_ptr)
bool operator<(const unique_ptr<Object>& lhs, const Object* rhs) {
  return std::less<const Object*>()(lhs.get(), rhs);
}
bool operator<(const Object* lhs, const unique_ptr<Object>& rhs) {
  return std::less<const Object*>()(lhs, rhs.get());
}

class MyClass
{
  // ...

private:
  std::set<std::unique_ptr<Object>, std::less<>> objects;  // Note std::less<> here
};

PSyton · Answer 3 · 2013-09-22T04:32:53.100

You can try to use boost::multi_index_container with additional indexing by Object*. Something like this:

typedef std::unique_ptr<Object> Ptr;
typedef multi_index_container<
  Ptr,
  indexed_by<
    hashed_unique<Ptr>,
    ordered_unique<const_mem_fun<Ptr,Object*,&Ptr::get> >
  >
> Objects;

Fore more information see Boost Multi-index Containers documentation

Or may be you can use std::shared_ptr everywhere, or use raw pointers in set instead?

Why you need to lookup by raw pinter? If you store it anywhere and check that object with this pointer is valid then better to use std::shared_ptr for storing in container and std::weak_ptr for other objects. In this case before usage you don't need lookup by raw pointer at all.

I was under the impression that you couldn't have a multi_index_container of unique_ptr because they were copied around. — Jean-Simon Brochu, Aug 24 '17 at 15:00

jbatez · Answer 4 · 2016-01-01T15:56:31.970

2

While definitely a hack, I just realized it's possible to construct a temporary "dumb" unique_ptr with placement new and not risk de-allocation. removeObject() could be written something like this:

void MyClass::removeObject(const Object* target)
{
  alignas(std::unique_ptr<Object>)
  char dumbPtrData[sizeof(std::unique_ptr<Object>)];

  objects.erase(
      *::new (dumbPtrData) std::unique_ptr<Object>(const_cast<Object *>(target)));
}

This solution would work for std::unordered_set, std::map keys, and std::unordered_map keys as well, all using standard C++11 only, with practically zero unnecessary overhead.

edited Jan 01 '16 at 15:56

answered Mar 13 '14 at 02:33

jbatez

1,772
14
26

1

To be more correct you'd need to use `alignas` to ensure the array is suitably aligned for the unique_ptr object. Another option is just `unique_ptr – Jonathan Wakely Jun 16 '15 at 16:21

score 1 · Answer 5 · edited May 23 '17 at 10:30

UPDATE 2: Yakk is correct, there is no way to do this with standard C++11 containers without significant compromises. Either something will run in linear time in the worst case or there are those workarounds that you write in your question.

There are two workarounds that I would consider.

I would try a sorted std::vector, similarly to boost::container::flat_set. Yes, the inserts / erases will be linear time in the worst case. Still, it might be much faster than you probably think: Contiguous containers are very cache friendly compared to node based containers, such as std::set. Please read what they write at boost::container::flat_set. Whether this compromise is acceptable for you, I cannot tell / measure.

Others also mentioned std::share_ptr. I personally try to avoid them, mainly because "a shared pointer is as good as a global variable" (Sean Parent). Another reason why I don't use them is because they are heavy weight, partly because of all the multi-threading stuff that I usually don't need. However, boost::shared_ptr, when BOOST_SP_DISABLE_THREADS is defined, removes all that overhead associated with multi-threading. I believe using boost::shared_ptr would be the easiest solution in your case.

UPDATE: As Yakk kindly pointed out, my approach has linear time complexity... :(

(The first version.)

You can do it by passing a custom comparator to std::lower_bound(). Here is a rudimentary implementation how:

#include <algorithm>
#include <cassert>
#include <iostream>
#include <memory>
#include <set>
#include <string>

using namespace std;

template <typename T>
class Set {

private:

    struct custom_comparator {
        bool operator()(const unique_ptr<T>& a, const T* const & b){
            return a.get() < b;
        }
    } cmp;

    set<unique_ptr<T>> objects; // decltype at begin() and end()
                                // needs objects to be declared here
public:

    auto begin() const -> decltype(objects.begin()) { return objects.begin(); }

    auto   end() const -> decltype(objects.end()  ) { return objects.end();   }

    void addObject(unique_ptr<T>&& newObject) {

        objects.insert(move(newObject));
    }

    void removeObject(const T* target) {

        auto pos = lower_bound(objects.begin(), objects.end(), target, cmp);

        assert (pos!=objects.end()); // What to do if not found?

        objects.erase(pos);
    }
};

void test() {

    typedef string T;

    Set<T> mySet;

    unique_ptr<T> a{new T("a")};
    unique_ptr<T> b{new T("b")};
    unique_ptr<T> c{new T("c")};

    T* b_ptr = b.get();

    mySet.addObject(move(a));
    mySet.addObject(move(b));
    mySet.addObject(move(c));

    cout << "The set now contains: " << endl;

    for (const auto& s_ptr : mySet) {

        cout << *s_ptr << endl;
    }

    mySet.removeObject(b_ptr);

    cout << "After erasing b by the pointer to it:" << endl;

    for (const auto& s_ptr : mySet) {

        cout << *s_ptr << endl;
    }
}

int main() {

    test();
}

score -1 · Answer 6 · answered Sep 22 '13 at 04:23

-1

You're using unique pinters here. This means, your set has unique ownership of objects. Now, this should mean that if object does exist, it's either in the set or you have unique pointer with it. You don't even need to look up the set in this case.

But to me it looks like it's not hte case. I suppose you're better off with shared pointer in this case. Just store shared pointers and pass them around since someone beside this set clearly stores them.

answered Sep 22 '13 at 04:23

Artem Tokmakov

1,135
10
7

9

No, not at all. `unique_ptr` is about *ownership*. It does not mean at all that there are no other pointers to the object, merely that there are no other owners. – Konrad Rudolph Sep 22 '13 at 14:00

Raw pointer lookup for sets of unique_ptrs

6 Answers6

Linked