11

Problem: I need to get a random element for a container and also delete it from that container. Container does not need to be sorted. I dont care about the order.

  • Vector can get me random element in O(1) but delete it only in O(N)
  • List deletes element in O(1) but can only get random element in O(N)

So I came up with an idea of making a custom vector that allow you to remove any element by its index with O(1)+ complexity. The idea is to swap the last element and an element you want to remove and then pop_back(). If you need to remove the last elemtent - just pop_back(). The order of the vector will not be the same but you get a fast remove method.

As i can understand deque have slower access by index and worse removal complexity then my solution but im not 100% sure.

I'm curious are there data structures that have random access and element removal in O(1) or O(logN) by index or mb by value ?

Stals
  • 1,543
  • 4
  • 27
  • 52
  • 2
    Why did you need to make a custom vector for this? Just swap the element to the end and remove it from there? This doesn't need to be a special class. – Nicol Bolas Feb 09 '12 at 21:08
  • I have given you a solution if you want to maintain the order of the elements, that would be O(log N) complexity. – CashCow Feb 09 '12 at 21:35
  • 1
    @NicolBolas He found a solution (not sure why he wanted a new collection for it) but asked if there is an O(1) or O(log N) solution. We know there is a constant time solution (as he found it himself) so the O(log N) could only mean one that maintains order. – CashCow Feb 09 '12 at 21:52
  • language agnostic: http://stackoverflow.com/questions/311703/algorithm-for-sampling-without-replacement – Ciro Santilli OurBigBook.com Feb 27 '17 at 11:21

3 Answers3

16

You have the solution, and it seems perfectly fine. The idiomatic way to write it in C++ is not to create another class (and please don't inherit from std::vector), but just to write a function:

template <typename T>
void remove_at(std::vector<T>& v, typename std::vector<T>::size_type n)
{
    std::swap(v[n], v.back());
    v.pop_back();
}

Usage:

remove_at(v, 42);

This offers the same exception guarantee as std::swap<T>.

Now if you want to return the object, and you have access to a C++11 compiler, you can do it the following way. The difficult part is to provide the basic exception guarantee in all cases:

template <typename T>
T remove_at(std::vector<T>&v, typename std::vector<T>::size_type n)
{
    T ans = std::move_if_noexcept(v[n]);
    v[n] = std::move_if_noexcept(v.back());
    v.pop_back();
    return ans;
}

Indeed, you don't want the vector to be left in an invalid state if an exception is thrown during a move operation.

Community
  • 1
  • 1
Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
0

With O(n) complexity

vec.erase(vec.begin() + randomIdx); randomIdx would be between 0 and vec.size()-1

If you want O(1) complexity you could either use the list container for example or swap the element with the last one and delete this instead. ( As mentioned by the others )

guitarflow
  • 2,930
  • 25
  • 38
  • Really? Why would that be? That's just a pointer-reassignment actually, isn't it? – guitarflow Feb 09 '12 at 21:09
  • 1
    @guitarflow : Because every element after index `n` must be relocated. – ildjarn Feb 09 '12 at 21:10
  • How would that be pointer reassignment? It's an array, not an array of pointers. To erase an element from the middle of the array, you have to shift every element after it down one. – Nicol Bolas Feb 09 '12 at 21:11
  • Ok, I have not been aware of that. I thought the vector would be organized in a linked-list kinda manner. – guitarflow Feb 09 '12 at 21:13
  • @guitarflow : `std::vector<>` must have contiguous storage for its elements, so it effectively __must__ be implemented in terms of an array. (And if it was a linked list, why would there be a `std::list<>` as well?) – ildjarn Feb 09 '12 at 21:16
  • swapping the element you wish to remove with the last one in the vector would be constant time but would not maintain the order. I assume you want the order of the remaining elements to be maintained. – CashCow Feb 09 '12 at 21:32
0

Yes, there is a solution, a well-balanced binary tree.

One each node you would need to store the number of nodes on each side. From this it is O(log N) to find the nth element.

Removing the nth element is also O(log N) as you have to traverse back up to tree to correct all the counts. Any rebalancing would be O(log N) too at most.

A tree is considered well balanced if no leaf node is 2 nodes deeper than another. Look up AVL trees to get a rabalancing algorithm.

It would actually be good if the standard library "opened" the use of the trees used for std::set and std::map as a public interface for use in custom trees.

CashCow
  • 30,981
  • 5
  • 61
  • 92
  • @CashCow: You are correct that I misread. I removed my comment and downvote. – Mooing Duck Feb 09 '12 at 21:44
  • What you describe *is* `std::set`. If the order of the elements must not be altered, then it may be the right solution if the number of elements is expected to grow. If you only have a few elements, `std::vector` + `erase` is also ok (and might be faster than `set`). – Alexandre C. Feb 09 '12 at 21:51
  • No, std::set requires elements being sorted and unique. There is no need for either of those here. Of course it might work with a map if map finds the nth element in O(log N) time (and it should) and if you create for each element some additional "key" that causes the element to insert where you want it. I am looking at an algorithmic point of view, i.e. you can random access remove (or insert) and maintain order in O(log N) complexity. – CashCow Feb 09 '12 at 21:56
  • Yes, I figured this out afterwards. It seems that there is indeed no way to access the nth element of a set in O(log n), as it should be possible. Nevertheless, I would not reimplement STL trees here, and try to find another solution than accessing elements by index. – Alexandre C. Feb 09 '12 at 22:23
  • See also http://en.wikipedia.org/wiki/AA_tree, which seems simpler to implement than RB trees – Alexandre C. Feb 09 '12 at 22:24
  • I was treating this more as a question of algorithmic complexity than a coding exercise in C++. As set and map both use a binary tree implementation it would be nice if the implementation was standardised and made available: well the interface to tree anyway, that enabled you to traverse, add nodes and rebalance without having to write this logic yourself. – CashCow Feb 09 '12 at 23:17