7

The question is what is the recommended way to use std::list to achieve O(1) erasure of list items?

Usually, when I choose a doubly linked list, I want to be able to remove an element from a list in O(1) time, and then move it to a different list in O(1) time. If the element has its own prev and next pointers, there is no real trick to getting the job done. If the list is a doubly linked circular list, then removal doesn't necessarily require knowing the list that contains the item.

As per Iterator invalidation rules, std::list iterators are very durable. So, it seems to me to get the behavior I desire when using std::list on my own item is to stash an iterator within my class, and the containing list.

class Item {
    typedef std::shared_ptr<Item> Ptr;
    struct Ref {
        std::list<Ptr>::iterator iter_;
        std::list<Ptr> *list_;
    };
    Ref ref_;
    //...
};

This has the downside that I will need to create my own decorated version of std::list that knows to update the ref_ whenever the item is added to a list. I can't think of a way that doesn't require the embedded iterator, since not having one would mean erasure would incur a O(n) find operation first.

What is the recommended way to get O(1) erasure with std::list? Or, is there a better approach to achieving the objective?


In the past, I have accomplished this by implementing my own list data structure, where the item placed in the list has its own next and prev pointers. Managing these pointers is natural, since they are inherent to the list operations themselves (the API to my list implementation adjusts the pointers). If I want to use the STL instead, what would be the best way to accomplish this? I offer the straw-man proposal of embedding the iterator. Are there better approaches?

If a concrete use-case is desired, consider a timer implementation. When a timer is created, it is placed into an appropriate list. If it is canceled, it is desirable to efficiently remove it. (This particular example can be solved via marking instead of removal, but it is a valid way to implement cancellation.) Additional use-cases are available upon request.


Another alternative I explored was to fuse a std::list with a std::unordered_map to create a specialized list for pointer types. This is more heavyweight (because of the hash table), but provides a container that is pretty close to the standard containers at the interface level, and gives me O(1) erasure of list elements. The only feature missing from the straw-man proposal is a pointer to the list which currently contains the item. I have put up the current implementation at CodeReview to solicit comment.

Community
  • 1
  • 1
jxh
  • 69,070
  • 8
  • 110
  • 193
  • 6
    Just think about it. You can never access the random element of linked list implementation in O(1) time, it will always be O(n). Only head and tail can be accessed within constant time. The fact that you need to be able to access certain element (random access) simply suggests by itself that you are using the wrong container, and should opt to `vector` or something similar. Creating such heavyweight "overkill" wrappers is both inefficient (in terms of memory and performance) and impractical from my point of view. – Alexander Shukaev Apr 18 '13 at 00:33
  • So the node in the list wants to remove itself from the list? – cppguy Apr 18 '13 at 00:33
  • @cppguy: In a nutshell, yes. – jxh Apr 18 '13 at 00:34
  • 5
    Look into Boost's intrusive containers, sounds more like what you want. – GManNickG Apr 18 '13 at 00:35
  • @Haroogan: If you have a better container in mind, I am open, but it has to have the same space and time complexity as `std::list`. My solution in the past was to implement my own list data structure to allow me this property. I am seeking ways to achieve the same property with STL. – jxh Apr 18 '13 at 00:37
  • yeah sorry, realised I missed that bit. It is, of course, impossible to do this unless your object is *itself* maintaining the links. Otherwise you will always have something pointing to your object and the next/previous nodes. With this, your object needs to have the address for its own container, or you must search for it each time. The solution is to manage the linked list inside your objects, if practical. – Dave Apr 18 '13 at 00:38
  • @user315052: No container has _"the same space and time complexity"_ except the `std::list` itself `:)`. You'd have to give up something in any case, and you should carefully think what is it in the context of your problem. I cannot suggest anything special right now because I don't know why you need it. – Alexander Shukaev Apr 18 '13 at 00:39
  • Have you tried just using std::map? – derpface Apr 18 '13 at 00:56
  • @uberwulu: There is no key, per se. Also, the operations are not O(1). – jxh Apr 18 '13 at 00:57
  • @Haroogan: I don't really give anything up if I am willing to embed the iterator. I just have to write a big wrapper to manage it for every list operation. You are saying it is not worth the effort to do so. Do you still think that is true if the alternative is to implement my own list altogether? – jxh Apr 18 '13 at 01:03
  • If you store the iterator for each element within the element, you might be able to call http://www.cplusplus.com/reference/list/list/erase/. complexity of erase is O(n) where n is the number of removed elements. And yes, `std::list` is a double linked list. But references on the location within the list should not be stored outside of the list. it's the job of the list to know these things, not yours. – scones Apr 18 '13 at 01:08
  • 5
    First of all, I still don't get why you require linked list so urgently, while knowing that its memory footprint is substantially higher than in `vector` for example. Secondly, because its elements are actually scattered across memory (do not form a continuous block like `vector` does), the cache misses will likely be 99%, which means it is going to be slow, no it is **really** going to be **slow**. Do you realize all these details? So I ask you again, are you absolutely sure that you need linked list with additional heavyweight wrapper over it (which will decrease efficiency even more)? – Alexander Shukaev Apr 18 '13 at 01:09
  • Furthermore, I suppose you realize that those iterators that you are going to save in your wrapper will be invalidated as soon as list is changed. What would you do about it? – Alexander Shukaev Apr 18 '13 at 01:11
  • @Haroogan: Without getting into specifics, let's just assume that I have already solved the object locality issue. What makes you believe the wrapper will be so heavyweight? As I stated in my question, list iterators are very durable (very hard to invalidate). – jxh Apr 18 '13 at 01:12
  • That's just the straightforward consequence. Assume the new element is inserted (or deleted) to the list near what does your wrapper do in this situation? Recompute iterators for all `Item`s? – Alexander Shukaev Apr 18 '13 at 01:20
  • @Haroogan: That's a faulty assumption. No such recompute is required. – jxh Apr 18 '13 at 01:21
  • 1
    Yep, sorry, forgot about that. Now I see your point there. Then the last question is whether the objects that you want to store in `list` have size `>>` than 8 to 16 bytes? I still don't know your requirements to memory. – Alexander Shukaev Apr 18 '13 at 01:27
  • @Haroogan: Around 64 bytes per object for the most common cases, but the list is used generically. – jxh Apr 18 '13 at 01:30
  • Well, that sounds like you'll be investing additional 50% for meta information, that might turn out to be substantial if you're aiming to manipulate huge `list`s. But if that's not the case, I guess you'll be fine with your proposed solution. I still don't know how you've managed locality of data, it looks like impossible with `list`. Even if you store these objects in the continuous array and store pointers to these objects into `list` (rather than objects themselves) the locality is still broken. – Alexander Shukaev Apr 18 '13 at 01:48
  • @Haroogan: Since I need to "move" an object from one list to another, it is a pointer, as hinted in my proposal. So, arrays would have the same issue. – jxh Apr 18 '13 at 01:50
  • 1
    I can see how something like this could be useful. However, maybe you could redesign your API to hand out iterators to clients, instead of the objects themselves? If you typedef it, the client wouldn't even need to know that it's an iterator; you can simply say that it's a pointer-like object. – Thomas Apr 21 '13 at 08:32
  • @Thomas: I think I understand, but could you provide a rough sketch in code of the idea in an answer? – jxh Apr 21 '13 at 08:40

5 Answers5

4

std::list::erase is guaranteed to be O(1).

There is not an awful lot of other ways to erase elements from a standard list. (std::list::remove and friends don't do quite the same thing so they don't count).

If you want to erase from a standard list, you need an iterator and the list itself. That's what you seem to already have. There is not very much freedom of doing it differently. I would keep the list containment separate from the objects, unlike what you have done, because why make an object that can be in only one list at a time? Seems like an unnecessary artificial restriction to me. But whatever powers your design.

n. m. could be an AI
  • 112,515
  • 14
  • 128
  • 243
  • I've provided a "complete" solution in an answer, and it addresses your last issue. – jxh Apr 21 '13 at 15:57
2

Maybe you could redesign your interface to hand out iterators instead of raw objects? In the case of your timers example:

class Timer {
  // ...
};

typedef std::list<Timer>::iterator TimerRef;

class Timers {

  public:

    TimerRef createTimer(long time);
    void cancelTimer(TimerRef ref);

  private:

    std::list<Timer> timers;
};

Of course, instead of

timer.cancel();

users of the class now have to say

timers.cancelTimer(timerRef);

but depending on your use case, that might not be a problem.


Update: moving timers between lists:

class Timers {

  public:

    Timer removeTimer(TimerRef ref);
    void addTimer(Timer const &timer);

    // ...

};

Usage:

timers2.addTimer(timers1.removeTimer(timerRef));

Admittedly it's a bit cumbersome, but so are the alternatives.

Thomas
  • 174,939
  • 50
  • 355
  • 478
  • I see, but I can't really move a timer from one list to a different list with this approach. Oh, I see, I would have to copy out the underlying data when moving to a different list. Could get tricky. Is the data still valid when the iterator is used to remove the item from a list? – jxh Apr 21 '13 at 08:50
  • Can be done; see edit. Basically, the trick is to distinguish between the types "a free-floating timer" (`Timer`) and "a timer somewhere in a list" (`TimerRef`). – Thomas Apr 21 '13 at 08:53
  • The other issues I have are minor, but the last tricky one is how would a client holding an iterator know it is invalid (say because the list it belonged to has been destroyed)? – jxh Apr 21 '13 at 09:04
  • It wouldn't. Then again, if the list has been destroyed, the client shouldn't have any pointers/references to it anymore, so it couldn't even try to use the invalid iterator anyway. – Thomas Apr 21 '13 at 09:07
  • If the `Timers` knew each `Timer` held an iterator, it could visit each member in its list, and set each member's iterator to `end()` or something like that. I suppose you think my "complete" solution below is too heavyweight. – jxh Apr 21 '13 at 09:14
  • Without knowing the exact situation, I can't decide which solution is superior. I was only offering an alternative. If your "heavyweight" solution works for you, great :) – Thomas Apr 21 '13 at 09:15
1

There is no way to have O(1) erasure from std::list.

you may want to consider using an intrusive list, where list nodes are directly imbedded into the structures, like you have already done.

you can use boost::intrusive or roll your own, also check out this

yngccc
  • 5,594
  • 2
  • 23
  • 33
  • Wouldn't embedding an `iterator` in the item be similar to the intrusive list approach, and allow for O(1) erasure? – jxh Apr 18 '13 at 02:39
  • @user315052 you are right, your approach is O(1). intrusive list can be useful if you want multiple lists to contain the same object but I don't see how it is better than imbedded iterator either. – yngccc Apr 18 '13 at 03:18
  • @yngum an intrusive list uses slightly less memory and needs one fewer pointer dereference, which could be valuable in some cases. Downside is the added complexity. – Dave Apr 18 '13 at 10:02
  • @Dave: `boost::intrusive::list` seems harder to use because it internally stores references to the object being pushed into it. While it is cool from an implementation point of view, it makes memory management of the objects trickier. It's like you have to have a separate container for the object itself before you put it into the `boost::intrusive::list`. – jxh Apr 23 '13 at 06:13
1

Here's a "complete" solution using an embedded iterator. Some private traits are used to help reduce clutter in the class:

template <typename T> class List;

template <typename T>
class ListTraits {
protected:
    typedef std::list<std::shared_ptr<T>> Impl;
    typedef typename Impl::iterator Iterator;
    typedef typename Impl::const_iterator ConstIterator;
    typedef typename Impl::reverse_iterator Rotareti;
    typedef typename Impl::const_reverse_iterator ConstRotareti;
    typedef std::map<const List<T> *, typename Impl::iterator> Ref;
};

As shown, the list implementation will be using std::list, but the underlying value type will be a std::shared_ptr. What I am after is allowing an instance of T to efficiently derive its own iterator, to achieve O(1) erasure. This is done by using a Ref to memoize the iterator of the item after it is inserted into the list.

template <typename T>
class List : public ListTraits<T> {
    template <typename ITER> class IteratorT;
    typedef ListTraits<T> Traits;
    typename Traits::Impl impl_;
public:
    typedef typename Traits::Impl::size_type size_type;
    typedef typename Traits::Impl::value_type pointer;
    typedef pointer value_type;
    typedef IteratorT<typename Traits::Iterator> iterator;
    typedef IteratorT<typename Traits::ConstIterator> const_iterator;
    typedef IteratorT<typename Traits::Rotareti> reverse_iterator;
    typedef IteratorT<typename Traits::ConstRotareti> const_reverse_iterator;
    class Item;
    ~List () { while (!empty()) pop_front(); }
    size_type size () const { return impl_.size(); }
    bool empty () const { return impl_.empty(); }
    iterator begin () { return impl_.begin(); }
    iterator end () { return impl_.end(); }
    const_iterator begin () const { return impl_.begin(); }
    const_iterator end () const { return impl_.end(); }
    reverse_iterator rbegin () { return impl_.rbegin(); }
    reverse_iterator rend () { return impl_.rend(); }
    const_reverse_iterator rbegin () const { return impl_.rbegin(); }
    const_reverse_iterator rend () const { return impl_.rend(); }
    pointer front () const { return !empty() ? impl_.front() : pointer(); }
    pointer back () const { return !empty() ? impl_.back() : pointer(); }
    void push_front (const pointer &e);
    void pop_front ();
    void push_back (const pointer &e);
    void pop_back ();
    void erase (const pointer &e);
    bool contains (const pointer &e) const;
};

This List follows mostly a queue like interface. But, an item may be removed from any position in the list. The simple functions mostly just delegate to the underlying std::list. But the push_*() and pop_*() methods also memoize the iterator.

template <typename T>
template <typename ITER>
class List<T>::IteratorT : public ITER {
    friend class List<T>;
    ITER iter_;
    IteratorT (ITER i) : iter_(i) {}
public:
    IteratorT () : iter_() {}
    IteratorT & operator ++ () { ++iter_; return *this; }
    IteratorT & operator -- () { --iter_; return *this; }
    IteratorT operator ++ (int) { return iter_++; }
    IteratorT operator -- (int) { return iter_--; }
    bool operator == (const IteratorT &x) const { return iter_ == x.iter_; }
    bool operator != (const IteratorT &x) const { return iter_ != x.iter_; }
    T & operator * () const { return **iter_; }
    pointer operator -> () const { return *iter_; }
};

This is the implementation of the helper template used to define the iterator types for List. What it does differently is that the * and -> operators are defined in a way that makes the iterator behave like it is a T * rather than a std::shared_ptr<T> * (which is what the underlying iterator would normally do).

template <typename T>
class List<T>::Item {
    friend class List<T>;
    mutable typename Traits::Ref ref_;
};

A type T that is derived from List<T>::Item can be added to a List<T>. This base class contains the Ref instance used to memoize the iterator when the item is added to a list.

template <typename T>
inline void List<T>::push_front (const pointer &e) {
    const Item &item = *e;
    typename Traits::Ref::iterator i = item.ref_.find(this);
    if (i == item.ref_.end()) {
        item.ref_[this] = impl_.insert(impl_.begin(), e);
    } else if (front() != e) {
        impl_.erase(i->second);
        i->second = impl_.insert(impl_.begin(), e);
    }
}

template <typename T>
inline void List<T>::pop_front () {
    if (!empty()) {
        const Item &item = *front();
        item.ref_.erase(this);
        impl_.pop_front();
    }
}

This code illustrates how the memoization is performed. When doing push_front(), the item is first checked to see if it is already contained. If it is not, it is inserted, and the resulting iterator is added to the ref_ object. Otherwise, if it is not already the front, then the item is removed and reinserted at the front, and the memoized iterator is updated. pop_front() removes the memoized iterator, and then calls pop_front() on the the std::list.

template <typename T>
inline void List<T>::push_back (const pointer &e) {
    const Item &item = *e;
    typename Traits::Ref::iterator i = item.ref_.find(this);
    if (i == item.ref_.end()) {
        item.ref_[this] = impl_.insert(impl_.end(), e);
    } else if (back() != e) {
        impl_.erase(i->second);
        i->second = impl_.insert(impl_.end(), e);
    }
}

template <typename T>
inline void List<T>::pop_back () {
    if (!empty()) {
        const Item &item = *back();
        item.ref_.erase(this);
        impl_.pop_back();
    }
}

push_back() and pop_back() are similar to push_front() and pop_front().

template <typename T>
inline void List<T>::erase (const pointer &e) {
    const Item &item = *e;
    typename Traits::Ref::iterator i = item.ref_.find(this);
    if (i != item.ref_.end()) {
        item.ref_.erase(i);
        impl_.erase(i->second);
    }
}

The erase() routine retrieves the memoized iterator, and uses it to perform the erasure.

template <typename T>
inline bool List<T>::contains (const pointer &e) const {
    const Item &item = *e;
    typename Traits::Ref::iterator i = item.ref_.find(this);
    return i != item.ref_.end();
}

Since an item is its own iterator in many respects, a find() method should not be needed in this version of List. But, in lieu of that is this contains() method which is used to see if the element is a member of the list.

Now, the presented solution uses a std::map to associate list instances to iterators. This maintains the spirit of O(1) if the number of lists an item is a member of simultaneously is relatively small.

I will try my hand at a boost::intrusive version next.

jxh
  • 69,070
  • 8
  • 110
  • 193
  • The `List` class is missing copy/move constructor and assignment operator methods. – jxh Apr 22 '13 at 22:34
0

Awful truth: while the linked list is powerful structure, the std::list can't fully exploit it's capabilities.

You can't erase the object from std::list using only iterators, because list has to deallocate the node, and you have to know where is the allocator that allocated the memory (hint: it's in a list).

Intrusive containers have many advantages over standard linked list, like self-awareness ;-), ability to store polymorphic objects by value and they make list tricks (like having single objects in multiple lists) feasible. Since you don't use std::list directly anyway, you can drop usage of std::list altogether and use third-party container, or roll your own.

(Also, your solution is also intrusive, since your type have to be derived from List<T>::Item, which places certain requirements on the type that std::list doesn't)

milleniumbug
  • 15,379
  • 3
  • 47
  • 71