7

I'm a student working on a small project for a high performance computing course, so efficiency is a key issue.

Let's say that I have a vector of N floats and I want to remove the n smallest and the n biggest elements. There are two simple ways of doing this:

A

sort in ascending order    // O(NlogN)
remove the last n elements // O(1)
invert elements order      // O(N)
remove the last n elements // O(1)

B

sort in ascending order     // O(NlogN)
remove the last n elements  // O(1)
remove the first n elements // O(N)

In A, inverting the element order requires swapping all the elements, while in B removing the first n elements requires moving all the others to fill the positions left empty. Using std::remove would give the same problem.

If I could remove the first n elements for free, then solution B would be cheaper. That should be easy to achieve if, instead of a vector, i.e. an array with some empty space after vector::end(), I had a container with some free space before vector::begin() as well.

So the question is: does an array-like container (i.e. contiguous memory, no linked lists) that allows O(1) insertion/removal at both ends already exist in some library (STL, Boost)?

If not, do you think that there are better solutions than creating such a data structure?

Daniel Heilper
Fabio
  • 1
@40two: I think `deque` is a linked-list behind the scenes. – masoud May 23 '14 at 09:02
  • 2
From cplusplus: "the elements of a deque can be scattered in different chunks of storage". It's still a very good suggestion, though! – Fabio May 23 '14 at 09:03
@MM. Nice comment, I agree. I don't know if it is implemented as a linked list though. – 101010 May 23 '14 at 09:09
  • 1
    A deque is actually a list (or an array) of chunks, so its performance will be much better than a simple linked list and really close to (sometimes better than) a vector. See http://stackoverflow.com/questions/6292332/what-really-is-a-deque-in-stl and http://baptiste-wicht.com/posts/2012/12/cpp-benchmark-vector-list-deque.html for example. – Ivan Vergiliev May 23 '14 at 09:14
Why is a linked-list-like container out of the question? – Chnossos May 23 '14 at 09:14
  • 1
Several reasons. Mainly: 1) I need to pass the vector to BLAS functions. 2) Efficient cache usage (the floats are cache aligned). 3) Future porting of the code to CUDA. Reasons 2 and 3 may be secondary, but in order to use BLAS there's no way around it, I need an array. – Fabio May 23 '14 at 09:20

4 Answers

6

Have you thought of using std::partition with a custom functor like the example below:

#include <iostream>
#include <vector>
#include <algorithm>

template<typename T>
class greaterLess {
    T low;
    T up;
  public:
    greaterLess(T const &l, T const &u) : low(l), up(u) {}
    // true for elements inside [low, up]; std::partition moves these to the front
    bool operator()(T const &e) const { return !(e < low || e > up); }
};

int main()
{
    std::vector<double> v{2.0, 1.2, 3.2, 0.3, 5.9, 6.0, 4.3};
    auto it = std::partition(v.begin(), v.end(), greaterLess<double>(2.0, 5.0));
    v.erase(it, v.end());

    for(auto i : v) std::cout << i << " ";
    std::cout << std::endl;

    return 0;
}

This way you would erase elements from your vector in O(N) time.

101010
this is only a partial solution. You don't say how to find the nth smallest/biggest element for the partition – Daniel Heilper May 23 '14 at 12:43
  • I'll accept this answer as it is the one that I'll probably use. But please mind that, as @hellfire769 pointed out, this solution doesn't properly answer the question. – Fabio May 24 '14 at 09:48
2

Try boost::circular_buffer:

It supports random access iterators, constant time insert and erase operations at the beginning or the end of the buffer and interoperability with std algorithms.

Having looked at the source, it seems (and is only logical) that data is kept as a continuous memory block.

The one caveat is that the buffer has fixed capacity and after exhausting it elements will get overwritten. You can either detect such cases yourself and resize the buffer manually, or use boost::circular_buffer_space_optimized with a humongous declared capacity, since it won't allocate it if not needed.

gwiazdorrr
This solution is interesting, however I'm afraid that the order in which the elements are stored may differ from the order in which they were inserted. That is, it might be that *begin > *end. It is true, though, that checking that *begin is always less than *end should solve this caveat. – Fabio May 24 '14 at 09:42
2

To shrink and grow a vector at both ends, you can use the idea of slices, reserving extra memory at the front and back ahead of time to expand into, if efficient growth is needed.

Simply make a class with not only a length but also indices for the first and last elements, plus a suitably sized vector, to create a window of data onto the underlying block of stored floats. A C++ class can provide inlined functions for things like deleting items, addressing into the array, finding the nth largest value, and shifting the slice values down or up to insert new elements while maintaining sorted order. Should no spare elements be available, dynamically allocating a new, larger float store permits continued growth at the cost of an array copy.

A circular buffer is designed as a FIFO, with new elements added at the end, removal at the front, and no insertion in the middle; a self-defined class can also (trivially) support array subscripts different from 0..N-1.

Due to memory locality, the avoidance of excessive indirection through pointer chains, and the pipelining of subscript calculations on a modern processor, a solution based on an array (or a vector) is likely to be the most efficient, despite element copying on insertion. A deque would be suitable, but it does not guarantee contiguous storage.

Additional supplementary information: researching classes that provide slices turns up some plausible alternatives to evaluate:

A) std::slice, which works on std::valarray via slice_array B) Boost.Range

Hope this is the kind of specific information you were hoping for. In general a simpler, clearer solution is more maintainable than a tricky one. I would expect slices and ranges over sorted data sets to be quite common, for example when filtering experimental data where "outliers" are excluded as faulty readings.

I think a good solution should actually be O(NlogN) + 2×O(1), with binary searches of O(logN + 1) for filtering on outlying values in place of deleting a fixed number of small or large values. It also matters that the "O" is relatively fast: sometimes an O(1) algorithm can in practice be slower, for practical values of N, than an O(N) one.

Rob11311
  • This answer is very precise but doesn't actually add any relevant information. – Fabio May 24 '14 at 09:49
Researching with the phrase "slice & vector" you'd find less obscure solutions than the other proposals, and they are very likely to be more efficient than repurposing data structures designed for other problems. IMO it's a matter of taste whether using vector as a building block for your own class is the best solution; my reading of the question was that you actually already knew the right solution. – Rob11311 May 25 '14 at 15:15
1

As a complement to @40two's answer: before partitioning the array, you will need to find the partitioning pivots, that is, the nth smallest and the nth greatest number in an unsorted array. There is a discussion of that on SO: How to find the kth largest number in unsorted array

There are several algorithms to solve this problem. Some are deterministic O(N); one of them is a variation on finding the median (median of medians). There are also non-deterministic algorithms with O(N) average case. A good source for those algorithms is Introduction to Algorithms.

So eventually, your code will run in O(N) time.

Daniel Heilper