34

Assume I have a set of unique_ptr:

std::unordered_set <std::unique_ptr <MyClass>> my_set;

I'm not sure what's the safe way to check if a given pointer exists in the set. The normal way to do it may be to call my_set.find (), but what do I pass as a parameter?

All I have from the outside is a raw pointer. So I have to create another unique_ptr from the pointer, pass it to find() and then release() that pointer, otherwise the object would get destructed (twice). Of course, this process can be done in a function, so the caller can pass the raw pointer and I do the conversions.

Is this method safe? Is there a better way to work with a set of unique_ptr?

  • Thanks. I don't need to move or copy anything, so unique_ptr is okay. I just need to let the caller give me a raw pointer, and I need to check if a matching unique_ptr exists in the set. – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:11
  • 1
    `unique_ptr` is obviously not what you need, since you clearly have other pointers to the object. – James Kanze Jul 25 '13 at 08:43
  • 6
    The owner of the unique_ptr is the only owner of the memory, and all others just hold references. I could use shared::ptr in the owner and weak_ptr everywhere else, but then each object is referenced by a single shared_ptr. I don't need the sharing, just a single owner – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:45
  • 11
    @JamesKanze I don't see why a `std::unique_ptr` has to be the only pointer to some object. The *"unique"* doesn't stand for *unique address*, but *unique ownership*. A `std::shared_ptr` that doesn't own its pointee is a far worse correctness crime. – Christian Rau Jul 25 '13 at 08:47
  • 2
    Not a duplicate. The linked question asks for `set`, this one for `unordered_set` and this is a large difference. – IS4 Mar 03 '18 at 11:10
  • 1
    Agree with @IllidanS4. `set` has transparent comparators for `find()`, while `unordred_set` does not. – Mikhail Aug 27 '18 at 19:32

6 Answers6

29

You can also use a deleter that optionally doesn't do anything.

template<class T>
struct maybe_deleter{
  bool _delete;
  explicit maybe_deleter(bool doit = true) : _delete(doit){}

  void operator()(T* p) const{
    if(_delete) delete p;
  }
};

template<class T>
using set_unique_ptr = std::unique_ptr<T, maybe_deleter<T>>;

template<class T>
set_unique_ptr<T> make_find_ptr(T* raw){
    return set_unique_ptr<T>(raw, maybe_deleter<T>(false));
}

// ...

int* raw = new int(42);
std::unordered_set<set_unique_ptr<int>> myset;
myset.insert(set_unique_ptr<int>(raw));

auto it = myset.find(make_find_ptr(raw));

Live example.

Xeo
  • 129,499
  • 52
  • 291
  • 397
  • I'd be tempted to consider an [`implicit` constructor here so you can `set_unique_ptr(raw, false)`](http://coliru.stacked-crooked.com/view?id=69a8470528f5380f02eba643929357e7-e54ee7a04e4b807da0930236d4cc94dc) instead. Thoughts? (I know I know. Implicit conversions, _evil_. But, concretely, any catch for _this particular_ use case?) – sehe Jul 25 '13 at 09:20
  • While I love your solution, I would still qualify it as a hack. – Tanveer Badar Oct 07 '20 at 18:01
14

Note that the ability to do heterogenous lookups on standard containers is subject of some proposals.

http://cplusplus.github.io/LWG/lwg-proposal-status.html lists

  • N3465 Adding heterogeneous comparison lookup to associative containers for TR2 (Rev 2) [Handle with N3573]
  • N2882 id.
  • N3573 Heterogenous extensions to unordered containers [Handle with N3465]

Especially the latter looks like it would cover your use case.

For now, here is an IMO not very pretty but working alternative workaround (O(n)):

#include <iterator>
#include <iostream>
#include <algorithm>

#include <unordered_set>
#include <memory>

#include <cassert>

struct MyClass {};

template <typename T>
struct RawEqualTo
{
    RawEqualTo(T const* raw) : raw(raw) {}

    bool operator()(T const* p) const  
        { return raw == p; }
    bool operator()(std::unique_ptr<T> const& up) const  
        { return raw == up.get(); }

  private:
    T const* raw;
};


using namespace std;
int main()
{
    std::unordered_set <std::unique_ptr <MyClass>> my_set;

    my_set.insert(std::unique_ptr<MyClass>(new MyClass));
    my_set.insert(std::unique_ptr<MyClass>(new MyClass));

    auto raw = my_set.begin()->get();

    bool found = end(my_set) != std::find_if(begin(my_set), end(my_set), RawEqualTo<MyClass>(raw));
    assert(found);

    raw = new MyClass;

    found = end(my_set) != std::find_if(begin(my_set), end(my_set), RawEqualTo<MyClass>(raw));
    assert(!found);

    delete raw;
}

Warning It's also very inefficient, of course.

sehe
  • 374,641
  • 47
  • 450
  • 633
  • +1 For mentioning heterogeneous lookups (and at least a warning that `std::find_if` drives the use of a `std::unordered_set` ad absurdum). – Christian Rau Jul 25 '13 at 08:27
  • Heterogenous extensions solve the problem, they do *exactly* what I need :-) But the solution you propose here is indeed too inefficient. I need the O(1) average of unordered_set::find – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:33
  • @Nawaz Because a `std::shared_ptr` would be absolutely misplaced if the memory just isn't shared. – Christian Rau Jul 25 '13 at 08:40
  • @Nawaz shared_ptr takes more memory. It's not necessarily an issue, but I simply have another option: create an extra unique_ptr, test against it and then release() it. I can catch exceptions throws by find(), so release() will always be called. Should be safe. And fast. – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:41
  • 1
    @ChristianRau But it obviously _is_ shared, since there are clearly pointers to the objects outside the container. – James Kanze Jul 25 '13 at 08:44
  • @fr33domlover I assumed you didn't need/want help implementing the workaround you already described in your question :/ Heterogenous container interface is cool, but also fraught with pitfalls. I hope we get it, but only if they get it _right_. You can follow some of the discussion at iso-cpp.org – sehe Jul 25 '13 at 08:45
  • 2
    @fr33domlover `find` shouldn't throw (unless the hash function or the comparison function throws). But having two `unique_ptr` to the same object sounds like a recipe for disaster, and a maintenance nightmare, since it violates the invariants of `unique_ptr`. – James Kanze Jul 25 '13 at 08:48
  • @JamesKanze Yes, but I create and release the pointer locally. It's not more dangerous than using new and delete, and accidentally deleting an object too early. I handle the extra pointer inside the function, safely, with just a single call to find() between the creation and the release – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:52
  • 1
    @JamesKanze Only if those other pointers are responsible for managing the object's lifetime (and thus *own* it). Who knows where the argument to that lookup comes from, it doesn't need to be a persistently stored lifetime managing reference. I'm not saying he might not need a `std::shared_ptr`, but that a mere lookup argument is not a sufficient reason for it. – Christian Rau Jul 25 '13 at 08:54
11

You can use a std::map<MyClass*, std::unique_ptr<MyClass>> instead of a set. Then you can add elements like this:

 std::unique_ptr<MyClass> instance(new MyClass);
 map.emplace(instance.get(), std::move(instance));
petersohn
  • 11,292
  • 13
  • 61
  • 98
  • Good idea! But it makes the set twice as large. It's not a problem in my case, but just wondering - will the method I described work and be completely safe? (by the way, it should be unordered_map, not map) – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:07
  • 3
    -1. What is wrong with `std::set`? In this case, `std::map` is bad. Key and value are same object essentially! – Nawaz Jul 25 '13 at 07:07
  • 2
    @Nawaz the unique_ptr holds the memory while the raw pointer is used for find() and all other methods which search-by-key. – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:10
  • 3
    @Nawaz Well, so how would you solve the "need to find a `unique_ptr` when I have only raw-pointer" problem? This answer solves it by introducing an explicit mapping. Note that `std::[unrodered_]set` is what the OP's currently got, and is asking how to do the search. – Angew is no longer proud of SO Jul 25 '13 at 07:14
  • You cannot "find a `unique_ptr`". It's unique, as its name indicates. On the other hand, you can find a `unique_ptr` belonging to a certain pointer this way. – petersohn Jul 25 '13 at 07:17
  • Actually, the unique_ptr hash is equal to the raw pointer hash, so if you make two unique_ptr from the same raw pointer, their hashes will be equal. That's why I suggested to make the extra one and then release() it – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:18
  • @fr33domlover Your method may be safe, but it may not compile. You can try it. Alternatively you can use `shared_ptr`, but that's also twice as large as a unique_ptr, if not more. – petersohn Jul 25 '13 at 07:19
  • I agree it's not safe, that's why I asked. But assume I write a special find() method which takes the caller's raw pointer, creates the extra unique_ptr (not safe) but after the test it release()s the extra pointer (as a result we go back to the initial safe conditions). Is that safe? – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:21
  • I compiled it with GCC, it works :) Still wondering about safety in general, if anyone has some advice – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 07:34
  • @Angew: Using `std::find_if` and lambda. See my solution. – Nawaz Jul 25 '13 at 08:09
  • @fr33domlover: Why hell bent on using `std::set::find` when it doesn't solve it? Why not use `std::find_if` which solves this? See my solution. – Nawaz Jul 25 '13 at 08:14
  • Using the `release()` method is unsafe if there is a possibility of an exception being thrown before calling `release()`. – petersohn Jul 25 '13 at 08:15
  • The only thing before release() is find(), so I guess it depends on whether find() can throw – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:28
  • @fr33domlover Even if it can, you can wrap it in a `try-catch` – Angew is no longer proud of SO Jul 25 '13 at 08:34
  • @Angrew you're right, this allows me to catch the exception locally, and then release() anyway. I guess it solves the problem :) – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:36
  • 4
    @Nawaz `std::unordered_set::find` is O(1); `std::find_if` is O(n). That can make a difference very quickly. – James Kanze Jul 25 '13 at 08:51
  • 1
    @petersohn Using the `release()` method is unsafe if you ever have to modify the code, or maintain it in any fashion. Regardless of exceptions. You're lying to the compiler _and_ to anyone reading the code. – James Kanze Jul 25 '13 at 08:53
  • @JamesKanze I'll put a comment there, explaining clearly what I do. Besides, it may be just a temporary workaround until "Heterogenous extensions to unordered containers" gets supported, and then I'll be able to remove the workaround – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:57
7

If the goal is constant time for the look up, I don't think that there is a solution. std::unordered_set<std::unique_ptr<MyClass>>::find requires an std::unique_ptr<MyClass> as argument. You will have to either change the container, or change the contained type.

One possibility might be to replace std::unique_ptr with std::shared_ptr, and change the rest of the code so that all MyClass are put into a shared_ptr as soon as they are created, and are only manipulated through shared pointers. Logically, this is probably more coherent anyway: unique_ptr pretty much implies (by its name, as well as its semantics) that there aren't other pointers to the object. On the other hand, you may not be able to use shared_ptr, if e.g. MyClass has pointers to other MyClass, which may build a cycle.

Otherwise, if you can accept O(lg n) access, rather than constant access (the difference generally doesn't become noticeable until the tables are fairly large), you can use an std::vector<MyClass>, using std::lower_bound to keep it sorted. Unlike std::unordered_set<>::find, std::lower_bound does not require the target value to have the same type as the value_type of the sequence; all you have to do is to ensure that they are comparable, say by providing a Compare object along the lines of:

class MyClassPtrCompare
{
    std::less<MyClass const*> cmp;
public:
    bool operator()( std::unique_ptr<MyClass> const& lhs,
                     std::unique_ptr<MyClass> const& rhs ) const
    {
        return cmp( lhs.get(), rhs.get() );
    }
    bool operator()( MyClass const* lhs,
                     std::unique_ptr<MyClass> const& rhs ) const
    {
        return cmp( lhs, rhs.get() );
    }
    bool operator()( std::unique_ptr<MyClass> const& lhs,
                     MyClass const* rhs ) const
    {
        return cmp( lhs.get(), rhs );
    }
    bool operator()( MyClass const* lhs,
                     MyClass const* rhs ) const
    {
        return cmp( lhs, rhs );
    }
};

Insertion may involve a number of moves, but moving a std::unique_ptr should be fairly cheap, and the improved locality of this solution might offset the additional runtime costs it otherwise imposes.

James Kanze
  • 150,581
  • 18
  • 184
  • 329
  • +1 Seems like a good compromise. Strange none of the `find_if` advocators cosidered this at all (but Ok, didn't think of this myself either). – Christian Rau Jul 25 '13 at 08:46
  • Good idea. Currently I don't mind O(lg n) but I plan to support large numbers of objects and lots of queries, so I prefer to use an average O(1) approach if possible – cfa45ca55111016ee9269f0a52e771 Jul 25 '13 at 08:47
  • 2
    @fr33domlover Yet you might be surprised how well that sorted vector will behave in practice, I guess. – Christian Rau Jul 25 '13 at 08:49
  • Good alternative solution using `std::vector`. It also uses **less* memory* than `std::set`. It reminds me of this article: [Why you shouldn't use set (and what you should use instead)](https://www.google.co.in/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CC0QFjAA&url=http%3A%2F%2Flafstern.org%2Fmatt%2Fcol1.pdf&ei=Z-fwUZK0OcT4rQfykYG4Cg&usg=AFQjCNEOAgURQSrd8Z8uBzP3m6l6Yd3qwA&sig2=1u6hVxYKHaPWgTPMH5kaSw&bvm=bv.49784469,d.bmk) – Nawaz Jul 25 '13 at 08:52
  • 4
    @fr33domlover From experience: you might find that the O(lg n) of the sorted vector outperforms the O(1) of the unordered_set, because of much better locality. – James Kanze Jul 25 '13 at 08:55
  • @ChristianRau oh I thought of it. But then, it was clearly just _not the question_. Also, I'd advocate a 'flat-storage set' but only if you reuse existing implementations. http://www.boost.org/doc/libs/1_54_0/doc/html/container/non_standard_containers.html#container.non_standard_containers.flat_xxx – sehe Jul 25 '13 at 09:27
  • I do not think that unique_ptr implies that there are no other pointers. It just implies that there is only one owner that determines the lifetime of the object pointed to. It is perfectly ok to have other pointers pointing to it, as long as they are not dereferenced after its lifetime. This is in fact what rust does by default, enforcing the lifetime constraints at compile time. In C++, you have no such help by the compiler, but when it is manageable, and you are tight on (hardware) resources, it is simply cheaper than a shared_ptr. – ovenror Jun 17 '21 at 17:01
1

If you can use Abseil, do it:

absl::flat_hash_set<std::unique_ptr<MyClass>> my_set;

just works :)

Mikhail
  • 20,685
  • 7
  • 70
  • 146
0

Here is the proper way to do it in C++20 with "Heterogeneous lookup for unordered containers" available:

struct Hash {
  using is_transparent = void;
  template <class P>
  size_t operator()(const P& p) const {
    return std::hash<P>{}(p);
  }
};
struct KeyEqual {
  using is_transparent = void;
  template <class P, class Q>
  bool operator()(const P& lhs, const Q& rhs) const {
    return std::to_address(lhs) == std::to_address(rhs);
  }
};

std::unordered_set<std::unique_ptr<MyClass>, Hash, KeyEqual> my_set;

More on the topic (in Russian): https://www.coursera.org/learn/c-plus-plus-brown/supplement/TtrLN/unordered-set-unique-ptr

Mikhail
  • 20,685
  • 7
  • 70
  • 146