-1

I need to stable sort an array with qsort. To ensure that the result is stable, I add one extra condition to the compare function:

int compare(const void *p1, const void *p2)
{
    if(*(const Data*)p1 < *(const Data*)p2)
        return -1;
    if(*(const Data*)p2 < *(const Data*)p1)
        return 1;
    else
        return p1<p2 ? -1 : 1;
}

This will work if qsort never calls compare(p,p). Otherwise I need to use a more complex condition. The question is, whether qsort ever calls compare() with a duplicate pointer or does it always compare different pointers?

UPDATE:

I checked this with Ideone C++ compiler: https://ideone.com/l026kM

For the small example { 8, 8, 1, 1 } in the comments, the provided implementation of qsort() does not change the order of pointers and does not call compare for the same element. This seems reasonable, because every reverse swap will affect performance, since it needs swapping back later on. I will test this with randomly generated arrays and different compilers.

UPDATE:

Tested on Ideone for 100000 random arrays with minimum 80% share of repetitive keys. The result is 100% stable sorted arrays. Here's the link: https://ideone.com/KOYbgJ

VC++Express 2008 compiler fails stable sort, because the order of the pointers is changed. This basically shows how VC++ implementation is different from GCC implementation in that it does not keep pointer order.

bkxp
  • 1,115
  • 1
  • 12
  • 20
  • Why would any sorting algorithm ever compare an element with itself? – Henri Menke Feb 21 '18 at 07:31
  • 5
    Ever heard of `std::stable_sort`? –  Feb 21 '18 at 07:31
  • Henri, yours is a philosophical question. I am asking about the specification of qsort that *should* be followed in any implementation however clumsy it may be. – bkxp Feb 21 '18 at 07:32
  • 5
    Why are you even using the C library’s qsort in what is supposedly a C++ program? – Paul R Feb 21 '18 at 07:32
  • Nicky, yes I've heard about stable_sort, of course. But I need shallow swap of structures, and it does not provide that. – bkxp Feb 21 '18 at 07:33
  • What do you mean by "shallow swap of structures"? –  Feb 21 '18 at 07:35
  • @bkxp If I am not mistaken the qsort provided in the question compares the pointer values and sorts the objects according to their addresses. `std::stable_sort(begin, end)` would do the same. – Jens Feb 21 '18 at 07:37
  • 4
    BTW, I think this is an XY problem. The property of the comparator does not affect the stability guarantee of a sorting algorithm. –  Feb 21 '18 at 07:41
  • Note that `stable_sort` is typically implemented as some form of _out-of-place_ mergesort, therefore there must be enough free memory available to run it (i.e., the same amount of memory as sorted data occupies). If you have not enough memory available and need stability, then you are in serious troubles ;-) – Daniel Langr Feb 21 '18 at 07:44
  • Nicky, I mean that qsort copies content as raw data while std::stable_sort uses assignment operator which gives too much unnecessary overhead. – bkxp Feb 21 '18 at 07:47
  • @bkxp How do you know what implementations of `qsort` and `std::stable_sort` use? – Daniel Langr Feb 21 '18 at 07:52
  • Nicky, How can a comparator not affect stability? You can append index as the lower bits to any value, which makes the keys different, or even generate keys dynamically in the comparator provided that the comparator does that consistently. – bkxp Feb 21 '18 at 07:52
  • 1
    Why do you even assume that? Have you measured it? Is your element type even trivially copyable? –  Feb 21 '18 at 07:53
  • Doing so is hacking the element comparison, not changing the sort stability. –  Feb 21 '18 at 07:55
  • 2
    @bkxp Learn some theory about sorting. You cannot sort if sorting keys of elements depend on their positions. – Daniel Langr Feb 21 '18 at 07:55
  • Daniel, I know about stable_sort, because I specifically searched for this information. std::sort and std::stable_sort are not guaranteed to even call swap, they can use assignment on the elements. https://stackoverflow.com/questions/14212701/stdsort-does-not-always-call-stdswap – bkxp Feb 21 '18 at 07:55
  • Nicky, of course I know if my element type is trivially copyable, because this is not a template. – bkxp Feb 21 '18 at 07:57
  • 2
    @bkxp Sure, they can. And what does it implies about performance? Nothing. Both raw copying and assignment can lead to the exactly same machine code. – Daniel Langr Feb 21 '18 at 07:57
  • Daniel, "can lead" does not mean "always lead", right? As for the sorting theory, yes, now I see where I made a mistake. I am comparing the current pointers, and not the original pointers. I need to compare original pointers for this to work. – bkxp Feb 21 '18 at 07:58
  • 1
    @bkxp And what makes you think that machine code generated for raw copy will be faster than for assignment, even if they are different? This is only guessing, always measure first. I would suggest to try `std::stable_sort` and your recently proposed approach with _original pointers_ added to `Data` and compare runtimes. – Daniel Langr Feb 21 '18 at 08:03
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/165541/discussion-between-bkxp-and-daniel-langr). – bkxp Feb 21 '18 at 08:06
  • 1
    Quicksort spends its entire time comparing elements against the pivot, which is itself an element. Sedgewick showed in the 1970s that it is actually better to use >= or <= rather than > or < at a particular phase, so you can practically guarantee that it will compare an element wih itself. – user207421 Feb 21 '18 at 10:30
  • @Jarod42 Thanks, fixed. By comparing with itself I meant passing the same pointers, not the same values with different pointers. – bkxp Feb 22 '18 at 09:36

2 Answers2

7

Anything that is not explicitly forbidden is implicitly allowed; in particular, I recall that VC++ in debug builds explicitly tests the comparer over the same element (among other sanity tests) before actually carrying out an std::sort. Don't know if it does the same with qsort, but it is allowed to.

But most importantly, your comparer breaks the requirements mandated by qsort; in particular:

When the same objects (consisting of size bytes, irrespective of their current positions in the array) are passed more than once to the comparison function, the results shall be consistent with one another. That is, for qsort they shall define a total ordering on the array, and for bsearch the same object shall always compare the same way with the key.

(C99, §7.20.5 ¶4, emphasis added)

Finally, as already outlined by Daniel Langr, even with a "forgiving" implementation this will not necessarily achieve what you are striving for.

That's to say: throw away this kludge and use a real stable sort algorithm, the library provides it already (std::stable_sort).


Also, since the whole point of this thing seems to be to have a faster sort than std::stable_sort and std::sort due to qsort avoidance of assignment operator:

BTW qsort is generally slower than std::sort for cheap comparers because it requires an indirect call for each compare, while in std::sort the functor gets inlined. and copying is often slower as well, as std::sort has to use memcpy (which has to be called and then to determine at runtime the size and copy accordingly), while an inlined assignment gets inlined (so again, it's way cheaper if your elements are small, and pretty much the same otherwise, as the synthesized assignment/copy of trivially assignable types generally boils down to memcpy

(chat link)

Matteo Italia
  • 123,740
  • 17
  • 206
  • 299
  • But I can still compare the original indexes of elements, if they are included in element values, right? This is no kludge then and has to ensure stable sorting with qsort. – bkxp Feb 21 '18 at 08:18
  • Yes, that is allowed, as it's data that do not depend from the current position in the container. – Matteo Italia Feb 21 '18 at 08:22
6

This will work if qsort never calls compare(p,p).

No, this won't ensure the stability of quicksort. It would be nice if it did :)


Consider partitioning of (8,8,1,1) with respect to pivot 5. First, outer 8 is swapped with outer 1, then inner 8 is swapped with inner 1. The order of both 1s and 8s is changed after partitioning while no elements with same keys has been compared.


It can be written more explicitly as follows:

partition(8_left, 8_right, 1_left, 1_right) -> (1_right, 1_left, 8_right, 8_left)

Daniel Langr
  • 22,196
  • 3
  • 50
  • 93
  • Why not? If two elements are equal, then the earliest element is preferred. – bkxp Feb 21 '18 at 07:36
  • @bkxp It's not just about comparing elements with same keys. See my updated answer. – Daniel Langr Feb 21 '18 at 07:40
  • You mean that the algorithm only compares 1 and 8 and never gets to comparing the pointers, right? But you are assuming the data is integer. However, the correct integer analogy would be: (8, 9, 1, 2). Now could you please explain how will 8 and 9 not be compared by qsort at least once? – bkxp Feb 21 '18 at 07:42
  • 1
    @bkxp If all your elements have different keys, then don't care about stability. Stability is important when sorted data contain duplicate keys. – Daniel Langr Feb 21 '18 at 07:46
  • 2
    Also, it breaks strict weak ordering, resulting in undefined behavior, as the ordering between two equivalent elements comes to depend from where they are currently located in the array instead of their value. Having a[0] – Matteo Italia Feb 21 '18 at 07:47
  • Matteo, does qsort make swaps that place the smaller element before the larger element of the same pair? If not, then in any swap, the pointer order will match the initial order of elements in the array. – bkxp Feb 21 '18 at 08:01
  • 1
    @bkxp Sure, whole partitioning, i.e., quicksort is based on such swapping. – Daniel Langr Feb 21 '18 at 08:05
  • But I can compare the original array indexes if they are present in the data structure. Will this provide stability to qsort regardless of the implementation? – bkxp Feb 21 '18 at 08:16
  • The above comment should read 'place the larger element before the smaller element'. There is absolutely no point in reversing the element order at any step, because it only adds extra iterations to rollback this reversal and leads to performance drop. So if element order is never reversed in any swap, then comparing current pointers will always yield the same results as comparing the original pointers. – bkxp Feb 21 '18 at 08:57
  • 1
    @bkxp No, it won't. Have you read my answer? Swaps are placing smaller elements before larger elements and the stability is still broken. There are no pointers compared there, since keys are different in both comparisons. – Daniel Langr Feb 21 '18 at 09:01
  • @bkxp How can you sort elements without reversing their order? How would you sort (8,1) array without reversing them? What's the difference between (8,1) and (8,8,1,1) which I used to demonstrate instability? – Daniel Langr Feb 21 '18 at 09:21
  • @Daniel Langr Of course you have to revert (8,1) to get (1,8), but you do not have to revert (1,8) to get (8,1). If the comparison yields -1, the caller code does not have to change the order of the elements. This has the side-effect of making pointer order invariant if you include pointer comparison in the compare() function. Consider the extreme case in which all elements are the same, e.g. 1. Only their pointers will be different. Will the algorithm swap any elements with the same value? Here is a test code on Ideone: https://ideone.com/l026kM – bkxp Feb 21 '18 at 09:29
  • @Daniel Langr, I have to add that the above test is meant to show how qsort() is actually implemented in a standard library, and not about the formal requirements for qsort(). I found the requirements online and they do not contain restriction on reverse ordering at any iteration. – bkxp Feb 21 '18 at 09:30
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/165548/discussion-between-bkxp-and-daniel-langr). – bkxp Feb 21 '18 at 09:53