2

I've almost understood many STL algorithms until I've reached the algorithm std::nth_element. I 'm stuck on it; I don't know how it works and it does do exactly.

For education and understanding sake can someone explain to me how the algorithm std::nth_element works?

std::vector<int> v{ 9, 3, 6, 2, 1, 7, 8, 5, 4, 0 };
std::nth_element(v.begin(), v.begin() + 2, v.end());

for (auto i : v)
    std::cout << i << " ";
std::cout << '\n';

The output:

1 0 2 3 6 7 8 5 4 9 
  • So where is nth element here?
  • How and what the algorithm does?
  • Does it do some sort of partial sorting?

Here is some explanation from cppreference.com:

nth_element is a partial sorting algorithm that rearranges elements in [first, last) such that:

  • The element pointed at by nth is changed to whatever element would occur in that position if [first, last) was sorted.
  • All of the elements before this new nth element are less than or equal to the elements after the new nth element. More formally, nth_element partially sorts the range [first, last) in ascending order so that the condition !(*j < *i) (for the first version, or comp(*j, *i) == false for the second version) is met for any i in the range [first, nth) and for any j in the range [nth, last). The element placed in the nth position is exactly the element that would occur in this position if the range was fully sorted.

nth may be the end iterator, in this case the function has no effect.

  • I am still confused about it. What is nth element and how to implement a possible algorithm like that?. For education sake I've mimicked many STL algorithms. Thank you so much!
Itachi Uchiwa
  • 3,044
  • 12
  • 26
  • `So where is nth element here?` What do you mean by "where"? `How and what the algorithm does?` Exactly what is stated in the documentation you cited. `Does it do some sort of partial sorting?` Well... `nth_element is a partial sorting algorithm` – tkausl Sep 18 '21 at 00:14
  • If you have trouble understanding the documentation, please tell us which lines/statements you don't understand. – tkausl Sep 18 '21 at 00:15
  • @tkausl: All the documentation sorry. – Itachi Uchiwa Sep 18 '21 at 00:15
  • 3
    Perhaps a more enlightening question to answer would be "under what circumstances would someone find calling `nth_element()` to be useful"? (presumably the function wasn't written just to make the STL larger; there must have been some common problem that someone wanted to solve that prompted them to write it and include it in the STL) – Jeremy Friesner Sep 18 '21 at 00:18
  • `1 0 2 3 6 7 8 5 4 9 ` the nth element in my example is 6 but 4 and 5 are in the [nth, last) and less than nth element?!!! – Itachi Uchiwa Sep 18 '21 at 00:30
  • 1
    @ItachiUchiwa: The nth element refers to position, not value. `v.begin() + 2` is the third element (index `2`, 0-based). If the whole array was sorted, `2` would appear at that position, and `nth_element` makes that happen. The position of all the other elements is semi-randomized, aside from the guarantee that all elements less than the one that ends up at index `2` come before it, and all those larger than it come after it. [Introselect](https://en.wikipedia.org/wiki/Introselect) is the recommended algorithm. – ShadowRanger Sep 18 '21 at 00:40
  • 1
    @JeremyFriesner: I imagine it has use cases similar to that of `partial_sort`. You need to separate largest and smallest elements at some ranking threshold, but don't need the elements sorted to get useful results. For example, to get the mean of the middle 90% of a dataset, you need to separate the 5% on either side as outliers, but the middle 90% don't need to be sorted. You could use `nth_element` once to separate the bottom 5%, then again (on the 95% right of the pivot) to separate out the top 5%. Then a linear pass to compute the mean. Three `O(n)` steps, no `O(n log n)` sort at all. – ShadowRanger Sep 18 '21 at 00:49

2 Answers2

7

So where is nth element here?

The n-th element is the 2 at index 2 because thats what you asked for when you passed begin()+2.

The element pointed at by nth is changed to whatever element would occur in that position if [first, last) was sorted.

This means that, if the vector was sorted, the order of elements would be

0 1 2 3 4 5 6 7 8 9 
    ^--- begin() + 2

You asked to have 3rd largest element at index 2 (3rd position), and thats what the algorithm does.

In addition it puts all elements smaller in the front and all elements larger in the back:

!(*j < *i) (for the first version, or comp(*j, *i) == false for the second version) is met for any i in the range [first, nth) and for any j in the range [nth, last).

Let's use indices rather than iterators, then for any i < 2 and any j > 2 it holds that v[i] < v[j]. In other words, 1 and 0 are both smaller than any element in 2 3 6 7 8 5 4 9.

463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
  • But the value at `v.begin() + 2` is `6` not `2`? 2 is in my output not in the original sequence. – Itachi Uchiwa Sep 18 '21 at 00:31
  • 1
    @ItachiUchiwa "The element pointed at by nth is changed to whatever element would occur in that position if [first, last) was sorted." The 2 is the value that would appear at index 2 **after** sorting the vector – 463035818_is_not_an_ai Sep 18 '21 at 00:32
  • 1
    @Itachi Uchiwa see edit – 463035818_is_not_an_ai Sep 18 '21 at 00:35
  • One last thing: can I implement a similar algorithm for education purpose? – Itachi Uchiwa Sep 18 '21 at 00:39
  • 1
    @ItachiUchiwa Can you? I cannot answer that. There are links to implementations on the cppreference page: [libstdc++](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algo.h#L4718) and [libc++](https://github.com/llvm-mirror/libcxx/blob/a12cb9d211019d99b5875b6d8034617cbc24c2cc/include/algorithm#L5109) – 463035818_is_not_an_ai Sep 18 '21 at 00:42
-1

I will explain my code first before getting into your problem

for example I have code like this

int m_array_biasa[8] {3,2,10,45,33,56,23,47};

and I use it normally like

std::nth_element(m_array_biasa, m_array_biasa + 4, m_array_biasa + 8);

so nth element in here is 4[33], the rule of std::nth_element is that the number on the left of nth must be less than or equal to and the number on the right must be greater than nth

and don't forget, data must be sorted from small to large (default)

so the data that was originally

3,2,10,45,33,56,23,47

changed to

2 3 10 23 33 45 47 56

my nth is 4[33], so the above rules apply (without including the sorting result)

and the result is

3 2 10 23 33 56 45 47

note above, position 33 has not changed, but sometimes it's a bit confusing, for example we change 33 to 1, then the result

2 1 3 10 23 45 47 56

what happened here, why did the number 1 move (replaced by 23), why didn't it stay like the previous number 33, I said before that we have to sort the data first (see sorting above), it turns out that index nth[4] is number 23 , then the number 1 is replaced with the number 23, why should it be replaced?, see the nth_element rule

now on to your question.

std::vector<int> v{ 9, 3, 6, 2, 1, 7, 8, 5, 4, 0 };
std::nth_element(v.begin(), v.begin() + 2, v.end());

v.begin() contains 9, v.begin() + 2 contains 6, remember, nth_element will sort it first

0 1 2 3 4 5 6 7 8 9

and your output is

1 0 2 3 6 7 8 5 4 9

the nth[2] above (acccording your v.begin() + 2)p is 2, so 2 here is like a reference for other data, the data before 2 must be less than it, and after it must be more than it

Sam Smith
  • 11
  • 3