12

I was wondering why does the c++ standard require that std::sort should only take random-access iterators? I don't see the advantage, since both std::sort and std::list::sort have a complexity of N*log(N). Restricting std::sort to random-access iterators (RAI) seems to have made it necessary to write a separate function for lists with the same complexity.

The same applies to partial_sort, where the non-RAI counter-part for list is simply missing to this day.

Is this design because people used variants of quick_sort to implement std::sort historically?

If there is advantage for writing sort algorithms on RAI containers, would it better to make std::sort more general, and let RAI containers like std::vector provide specialized v.sort?

Community
  • 1
  • 1
thor
  • 21,418
  • 31
  • 87
  • 173
  • You say people used quicksort to implement std::sort "historically." What do you think they use today? – John Zwinck Jul 18 '14 at 04:37
  • 1
    @JohnZwinck They can't use (unmodified) quicksort today. The C++11 standard mandates `std::sort` to be `O(N log N)` in all cases, quicksort is `O(N^2)` worst case. – T.C. Jul 18 '14 at 04:39
  • @JohnZwinck: Are you sure that in C++ `O(N log N)` is different from `O(N^2)` ? Note that in C++ `N` has an upper bound. – 6502 Jul 18 '14 at 04:41
  • 8
    @6502: You mean `SIZE_MAX`? By that logic, *all* algorithms (that halt) are O(1). But that's not a very useful way to describe complexity. – dan04 Jul 18 '14 at 04:58
  • @dan04: It's not "that" logic or "this" logic. I'm just saying that using big-O notation when the input size is limited is (formally) a nonsense (unless you're using a logic that is incompatible with mathematics). From a **practical** point of view it's a great idea and much easier to think with than using a formally correct approach (e.g. providing explicit polynomials like Knuth does in TAOCP) but I wouldn't put too much emphasis on it. – 6502 Jul 18 '14 at 05:13
  • 2
    @6502 You're right in principle, but most of us tacitly agree to only compute asymptotic behavior up to a point before the asymptote runs off the graph and on toward infinity. Explicit polynomials are fundamentally no different; coefficients are added back but they're only valid up to a point. If you're really building a model for real-world software, you'll account for memory cache effects and whatever else — using empirical measurements and profiling as a basis, not abstract mathematics. – Potatoswatter Jul 18 '14 at 05:46
  • See [wikipedia's entry for C++ sort](http://en.wikipedia.org/wiki/Sort_%28C%2B%2B%29). – juanchopanza Jul 18 '14 at 06:09
  • @Potatoswatter: there are two related problems with big-O notation once you try to move outside theoretic computer science: the first is that N cannot go to infinity and thus it doesn't make sense (formally) to talk about asymptotic behavior or limits. The second is that once you remove infinite, the **arbitrary** constant plays a role. Your "tacit assumption" simply means that you're not talking about big-O any more. In C++ there are formally established implementation-defined limits thus it's not even necessary to talk about practical finiteness of resources. – 6502 Jul 18 '14 at 09:30
  • @Potatoswatter: explicit polynomials are a formally correct method. They're about the number of cycles of the MIX ideal processor and they accurately describe time complexity (in that virtual world). Note that no arbitrary constants are introduced; cache effects are not included because the original MIX didn't model it. Describing time complexity in terms of polynomials for the count of elementary operations however would have been annoying for implementers and that's why abusing big-O is in my opinion a great idea. Still I wouldn't try to read too much in big-O for C++, foundation is weak. – 6502 Jul 18 '14 at 09:39
  • In case you want to write your own sorting routines for forward/bidirectional iterators: http://stackoverflow.com/q/24650626/819272 – TemplateRex Jul 18 '14 at 12:12
  • @6502 You seem to be dancing around the point. Like anything, a realization of MIX is ultimately limited by its connected storage to either halt within a certain (very long) time, or loop forever. Using big-O to describe real-world complexity isn't an abuse, but as an approximation it does only hold for "big, but not too big" inputs. Polynomials merely loosen that requirement to "within the modeled range, and not too big." – Potatoswatter Jul 19 '14 at 05:49

2 Answers2

4

O(N*log(N)) complexity doesn't imply that the container is iterated in order, nor that the changes to it are only made in scan order. To use sequential iterators would come at a O(N) memory cost to store all of those iterators.

Kuba hasn't forgotten Monica
  • 95,931
  • 16
  • 151
  • 313
0

Algorithm complexity doesn't say it all. Actually in C++ (from a formal point of view) it doesn't say anything because you cannot grow N to infinity (size_t isn't an arbitrary precision number) thus every sort algorithm written in C++ is (formally) also O(1).

From a practical point of view std::sort is the grandson of qsort and it's implemented most probably as a variation of quicksort.

Using merge sort for arrays would require an extra storage proportional to the size of the array (the link to next element) while merge sort for lists doesn't require any extra space because you can reuse the link already present in the node (that it's going to be destroyed by sorting anyway).

The idea of programming without knowing what kind of container you're dealing with is mostly an illusion, thus having two different explicit ways to sort efficiently two different types of containers is not considered bad per se.

It's indeed annoying that std::sort doesn't contain a specialization also for list iterators (I'm not a template metaprogramming guru but it would seem quite straightforward to do).

6502
  • 112,025
  • 15
  • 165
  • 265
  • "annoying that `std::sort` doesn't contain a specialization also for list iterators" -- aside from anything else, this would let you sort a sub-range of a `list`, which `list::sort` doesn't allow directly. – Steve Jessop Jun 27 '15 at 01:14