0

let's define

struct A
{ 
int m;
...
}

std::vector<A> vec; //a large vector (magnitude of million members)

I am interested to find top 2% member of vec where the members have highest value in their m method.

For that I was thinking something like this:

std::multimap<int, A> top_members;

const auto nr_top = std::lround(vec.size() * 0.02);
for (auto it = vec.begin(); it != vec.end(); ++it)
  {
  if(top_members.size() == nr_top + 1)
    top_members.erase(std::prev(top_members.end()));
  else
    top_members[it->m] = *it
  }

Do you think of any faster solution?

H'H
  • 1,638
  • 1
  • 15
  • 39

3 Answers3

3

It looks like you want partial_sort. It will sort 2% of your items, and the remaining 98% will be in essentially random order.

MSalters
  • 173,980
  • 10
  • 155
  • 350
2

std::partial_sort has a complexity of n log m where n is the number of elements and m is the number elements you want sorted. You can use this to sort just the top 2%.

const auto nr_top = std::lround(vec.size() * 0.02);
std::partial_sort(vec.begin(), vec.begin() + nr_top, vec.end(),
  [](auto a, auto b){return a > b}
);
// the top 2% will be at the front of the vector

If you need the top 2% in a different container, you can use std::partial_sort_copy instead.

Philip Nelson
  • 1,027
  • 12
  • 28
0

If you can modify the vector and the elements inside are not already sorted, you can make use of std::nth_element:

template< class RandomIt, class Compare >
void nth_element( RandomIt first, RandomIt nth, RandomIt last,
                  Compare comp );

nth_element is a partial sorting algorithm that rearranges elements in [first, last) such that:

  • The element pointed at by nth is changed to whatever element would occur in that position if [first, last) were sorted.
  • All of the elements before this new nth element are less than or equal to the elements after the new nth element. ...

Complexity: Linear in std::distance(first, last) on average.

So that, given a std::vector<A> v, in your case you could write

size_t top_two = std::llround(v.size() * 0.02);

std::nth_element(v.begin(),
                 v.begin() + top_two,
                 v.end(),
                 [](A const& a, A const& b){ return b.m < a.m; });
Bob__
  • 12,361
  • 3
  • 28
  • 42