2

I have a std::vector with thousands of objects, stored as shared_ptr. Since the object has many attributes that can be used for searching, I maintain multiple indexes to this std::vector on std::map and std::multimap using weak_ptr.

std::vector<std::shared_ptr<Object>> Objects;
std::map<int,std::weak_ptr<Object>> IndexByEmployeeId;
std::multimap<std::string,std::weak_ptr<Object>> IndexByName;

Since the map and multimap are balanced binary trees, the search/modify are super fast. However, I am a bit foxed about delete. I want to delete after looking up the object via one of the indexes. Locking the weak_ptr gives me shared_ptr, but it doesn't allow me to destroy object on the vector. Is there any way to delete the original object on the vector?

Sharath
  • 1,627
  • 2
  • 18
  • 34
  • 8
    Why do you even want to manually delete an object held by a `shared_ptr`? That defeats the purpose of using `shared_ptr` – UnholySheep May 04 '17 at 11:00
  • 4
    How about removing the `shared_pr` from the vector? There is no magic here - removing the the `shared_ptr` from the vector will call its destructor (unless you remove by moving). If there aren't any other `shared_ptr` instances holding up the ref-count to the referenced object, the destructor of the `shared_ptr` will automatically release the memory of the instance of `Object`. Remember to properly check the success of `std::weak_ptr::lock()` afterwards, i.e. check the result against `nullptr`, or simultaneously remove the `weak_ptr`s in your maps. – thokra May 04 '17 at 11:05
  • Removing from the vector means I have to iterate over the vector to find and remove it. That defeats the purpose of having indexes. Please note I am not trying to understand the working of smart_ptr/weak_ptr.. Is there any way to reset() the smart_ptr on the vector, without iterating over it? I only have a weak_ptr to the object. – Sharath May 04 '17 at 11:14
  • So do you have the index of the `shared_ptr` in `Objects`? If yes then what is the problem with calling `.reset()` on it? Otherwise no, you can't hack the `shared_ptr` through a `weak_ptr`, you need to access the actual one (which means finding it in the vector) – UnholySheep May 04 '17 at 11:22
  • I don't have the "i" to call Objects[i].reset() function. – Sharath May 04 '17 at 11:22
  • Besides, any numeric index I store will get invalidated when items are removed from the vector. – Sharath May 04 '17 at 11:24
  • 6
    You can't escape the fact that you need to know the location in the vector to erase it from the vector. I think you will have to reevaluate your whole structure. Could you make the `std::map` the owner/primary-index and use `std::unique_ptr` then other indexes could simply use raw pointers that can be found and removed quickly? – Galik May 04 '17 at 11:40
  • If you store the objects by `shared_ptr`, then they are **obviously** not owned by `Objects` and hence must not be deleted (but at most removed from `Objects`). – Walter May 04 '17 at 11:46
  • @Walter: Uhm, what? Everyone holding a `shared_ptr` has shared ownership of the `Object` instance - that's the whole point. The only "consensus" is: the last remaining owner is responsible for destruction (unless released prior to the last referencing `shared_ptr` dtor). The vector `Objects` is definitely an owner - the maps are not. – thokra May 04 '17 at 12:24

3 Answers3

2

This can be a use case where std::set is the appropriate choice over std::vector. std::set guarantees lookup, insertion and removal in logarithmic time. So you can lookup an object by index in one of your maps objects and then delete it in any container with log(N) performance.

I would suggest this approach if insertion/removal operations represent the performance bottleneck in your application.

By the way, consider carefully the actual need for shared_ptr, because shared_ptr comes with a certain performance overhead as opposed to unique_ptr for example. Your owner container could use unique_ptr and the various maps could simply use raw pointers.

Emerald Weapon
  • 2,392
  • 18
  • 29
2

It seems from what you've posted that your data structures are inappropriate for what you want to do.

  1. shared_ptr should only be used to express shared ownership. However, your posted code and your desire to delete the objects pointed to indicate that Objects is in fact the sole owner of its objects.
  2. Your vector<shared_ptr<object>> appears to be used only as data storage container (the desired functionality of searching by id or name is implemented elsewhere), but removing elements from a vector is typically expensive, so you may better use another type of container.

If there are no other shared_ptr to your objects, then your design is poor. In this case you shouldn't use any smart pointers but simply container<object> and maps into the that. For example something like this (not tested not even compiled):

struct data_t { std::string name; /* more data */ };
using objt_map = std::map<std::size_t,data_t>;
using object   = objt_map::value_type;    // pair<const size_t,data_t>
using obj_it   = objt_map::iterator;
using name_map = std::multimap<std::string, obj_it>;
objt_map Objects;
name_map NameMap;

std::forward_list<obj_it> find_by_name(std::string const&name) const
{
    auto range = NameMap.equal_range(name);
    std::forward_list<obj_it> result;
    for(auto it=range.first; it!=range.second; ++it)
        result.push_front(it->second);
    return result;
}

std::forward_list<obj_it> find_by_id(std::size_t id) const
{
    auto it = Objects.find(name);
    return {it == Objects.end()? 0:1, it};
}

void insert_object(std::size_t id, data_t const&data)
{
    auto it = Objects.find(id);
    if(it != Objects.end())
        throw std::runtime_error("id '"+std::to_string(id)+"' already known");
    Objects[id] = data;
    NameMap.emplace(data.name, Objects.find(id));
}

void delete_object(obj_it it)
{
    if(it==Objects.end())
        throw std::runtime_error("attempt to delete invalid object");
    auto range = NameMap.equal_range(it->second.name);
    for(auto i=range.first; i!=range.second; ++i)
        if(i->second==it) {
            NameMap.erase(i);
            break;
        }
    Objects.erase(it);
}

Note iterators of std::map remain valid upon insertion and deletion (of other objects), such that the returns from the finders are not invalidated by insertion and deletion. I use std::forward_list<obj_it> to return objects to allow for returning none or several.

Walter
  • 44,150
  • 20
  • 113
  • 196
  • The object is created elsewhere and comes to me as shared_ptr. Once the object is stored, it does own it. The object itself has members, some are again smart_ptr. The smart_ptr was used because it is very easy to transfer the objects across functions while avoiding copying every time. I can check the feasibility of using unique_ptr and move() instead of smart_ptr. – Sharath May 04 '17 at 14:02
  • 1
    @Sharath It sounds like it probably should not be created as a `shared_ptr` if your part of the system owns it. Do you have any control over that? Usually when objects are created they are placed in a `unique_ptr` because that gives the consumer the choice to move it into a `shared_ptr` only if they need that. – Galik May 04 '17 at 14:07
  • I do have control over the whole system. But the object gets created in a different module and gets passed over a few functions, before it gets stored. I want to avoid copies since these are complex objects. – Sharath May 04 '17 at 14:15
  • @Walter, I am a bit concerned about how you are storing objt_map::iterator in the multimap. Won't the iterator get invalidated as more entries are inserted into the object map? – Sharath May 04 '17 at 14:19
  • 1
    @Sharath As I pointed out, the existing iterators of a [`std::map`](http://en.cppreference.com/w/cpp/container/map) remain valid upon insertion and deletion (apart from iterators to the deleted objects, of course). – Walter May 04 '17 at 14:37
  • 1
    Obtaining the objects via `smart_ptr` and then owning it makes no sense. If you own it, you should not import it via a `shared_ptr`, but by moving a `unique_ptr`. This is the natural C++ way to transfer ownership. A `shared_ptr` expresses shared ownership, thus the resource from which you obtain it may/does retain some share of the ownership and you must not delete the object (which defies the whole point of smart pointers: they should delete the object automatically). – Walter May 04 '17 at 14:40
  • Ah, I missed that comment. – Sharath May 04 '17 at 14:48
  • @Sharath Do you have C++14? – Walter May 04 '17 at 14:51
  • 1
    Alright, I think I will change it to unique_ptr and use the move function. And switch from vector to map. My code needs to compile under Debian (gcc 4.9.2) and Win64 (VC++ 2015). Not sure C++14 is fully supported. – Sharath May 04 '17 at 14:54
1

So, here is another option, based on importing the objects by moving std::unique_ptr<>. Unfortunately, unique_ptrs are not useful keys for std::set (since they are unique), unless you have C++14, when set::find() can take another argument than a key (see below).

For a C++11 approach, one must therefore use std::map to store the unique_ptrs, which requires doubling the id and name entries: once in data_t and once as keys in the maps. Here is a sketch.

struct data_t {
    const std::size_t id;       // changing these would
    const std::string name;     // lead to confusion
    /* more data */
};
using data_ptr = std::unique_ptr<data_t>;
using data_map = std::map<std::size_t,data_ptr>;
using obj_it   = data_map::iterator;
using name_map = std::multimap<std::string,obj_it>;
data_map DataSet;
name_map NameMap;

std::vector<data_t*> find_by_name(std::string const&name) const
{
    auto range = NameMap.equal_range(name);
    std::vector<data_t*> result;
    result.reserve(std::distance(range.first,range.second));
    for(auto it=range.first; it!=range.second; ++it)
        result.push_back(it->second->get());
    return result;
}

data_t* find_by_id(std::size_t id) const
{
    auto it = DataSet.find(id);
    return it == DataSet.end()? nullptr : it->second.get();
}

// transfer ownership here
void insert_object(data_ptr&&ptr)
{
    const auto id = ptr->id;
    if(DataSet.count(id))
        throw std::runtime_error("id '"+std::to_string(id)+"' already known");
    auto itb = DataSet.emplace(id,std::move(ptr));
    auto err = itb.second;
    if(!err)
        err = NameMap.emplace(itb.first->name,itb.first).second;
    if(err)
        throw std::runtime_error("couldn't insert id  "+std::to_string(id));
}

// remove object given an observing pointer; invalidates ptr
void delete_object(data_t*ptr)
{
    if(ptr==nullptr)
        return;                 // issue warning or throw ?
    auto it = DataSet.find(ptr->id);
    if(it==DataSet.end())
        throw std::runtime_error("attempt to delete an unknown object");
    auto range = NameMap.equal_range(it->second->name);
    for(auto i=range.first; i!=range.second; ++i)
        if(i->second==it) {
            NameMap.erase(i);
            break;
        }
    DataSet.erase(it);
}

Here is a sketch for a C++14 solution, which avoids the duplication of the id and name data in the maps, but requires/assumes that data_t::id and data_t::name are invariable.

struct data_t {
    const std::size_t id;       // used as key in set & multiset:
    const std::string name;     // must never be changed
    /* more data */
};

using data_ptr = std::unique_ptr<data_t>;

struct compare_id {
    using is_transparent = std::size_t;
    static bool operator(data_ptr const&l, data_ptr const&r)
    { return l->id < r->id; }
    static bool operator(data_ptr const&l, std::size_t r)
    { return l->id < r; }
    static bool operator(std::size_t l, data_ptr const&r)
    { return l < r->id; }
};

using data_set = std::set<data_ptr,compare_id>;
using data_it  = data_set::const_iterator;

struct compare_name {
    using is_transparent = std::string;
    static bool operator(data_it l, data_it r)
    { return (*l)->name < (*r)->name; }
    static bool operator(data_it l, std::string const&r)
    { return (*l)->name < r; }
    static bool operator(std::string const&l, data_it r)
    { return l < (*r)->name; }
};

using name_set = std::multiset<data_it,compare_name>;

data_set DataSet;
name_set NameSet;

std::vector<data_t*> find_by_name(std::string const&name) const
{
    auto range = NameSet.equal_range(name);
    std::vector<data_t*> result;
    result.reserve(std::distance(range.first,range.second));
    for(auto it=range.first; it!=range.second; ++it)
        result.push_back((*it)->get());
    return result;
}

data_t* find_by_id(std::size_t id) const
{
    auto it = DataSet.find(id);
    return it == DataSet.end()? nullptr : it->get();
}

// transfer ownership here
void insert_object(data_ptr&&ptr)
{
    const auto id = ptr->id;
    if(DataSet.count(id))
        throw std::runtime_error("id '"+std::to_string(id)+"' already known");
    auto itb = DataSet.emplace(std::move(ptr));
    auto err = itb.second;
    if(!err)
        err = NameSet.emplace(itb.first).second;
    if(err)
        throw std::runtime_error("couldn't insert id  "+std::to_string(id));
}

// remove object given an observing pointer; invalidates ptr
void delete_object(data_t*ptr)
{
    if(ptr==nullptr)
        return;                 // issue warning or throw ?
    auto it = DataSet.find(ptr->id);
    if(it==DataSet.end())
        throw std::runtime_error("attempt to delete an unknown object");
    auto range = NameSet.equal_range(ptr->name);
    for(auto i=range.first; i!=range.second; ++i)
        if((*i)==it) {
            NameSet.erase(i);
            break;
        }
    DataSet.erase(it);
}

There may well be some bugs in here, in particular errors with dereferencing the various iterator and pointer types (though once it compiles these should be okay).

Walter
  • 44,150
  • 20
  • 113
  • 196