Hoare partitioning algorithm - strict or not strict inequalities?

Question

I'm having a really hard time with the implementation of Hoare's partition algorithm. Basically, what i want to do is split an array into two parts, the first one containing numbers lesser than the given x, and the other one containing the greater. However, I just can't figure out a good implementation. This is my code:

void hoare(vector<int>&arr,int end, int pivot)
{
    int i = 0;
    int j = end;

    while (i < j)
    {
        while (arr[i] < pivot)
            i += 1;

        while (arr[j] > pivot)
            j -= 1;            

        swap(arr[i],arr[j]);
    }

    // return arr;
    for (int i=0; i<end; i++)
    printf("%d ", arr[i]);
}

Now I've found out that loads of sites have while (arr[i] <= pivot) instead of what I put down there. However, when I do that, for an array like this:

1 3 5 7 9 2 4 6 8

I get:

1 3 5 4 9 2 7 6 8

But then again, in my version, for such a set:

12 78 4 55 4 3 12 1 0

the program freezes, because since neither condition in the outer loop is fulfilled, it just goes through it over and over again, without incrementing j or i.

The pivot is a pointer to a specific number in the array, counting from 1; for instance, number 3 passed to function in the first example would mean the pivot equals arr[2], which is 5

Sorry if that's a noob question or has already been answered, but I've spent the whole day on this (also searching the net for a solution) to no avail and now I'm having suicidal thoughts.

Thanks in advance.

The pivot is a pointer to a specific number in the array, counting from 1; for instance, number 3 passed to function in the first example would mean the pivot equals arr[2], which is 5. — user4520, Dec 25 '12 at 18:56
could you update your question to add that information? and also add which values you actually used? — didierc, Dec 25 '12 at 19:01
You seem confused about what `pivot` is. If it's an index in the array then it doesn't make sense to compare it with an element of the array like your code does in `arr[i] < pivot`. Imagine the array being an array of strings, you would be comparing a string with an integer. — 6502, Dec 25 '12 at 19:01
related: http://stackoverflow.com/questions/7198121/quicksort-and-hoare-partition — jfs, Dec 25 '12 at 19:03
have a look at [this SO question](http://stackoverflow.com/questions/7198121/quicksort-and-hoare-partition). — didierc, Dec 25 '12 at 19:05
If neither `i` nor `j` changes on an iteration, you end up with an infinite loop, so the termination criterion demands that at least one of `i` or `j` is changed so as to reduce the total work on each iteration of the outer loop. That is likely why the condition is `<=` or `>=` instead of `<` or `>`. The x-ref'd [question](http://stackoverflow.com/q/7198121/15168) uses `do ... while()` loops which are guaranteed to execute once and the `...` action is an increment of `i` or decrement of `j`, which ensures that the outstanding work decreases on each iteration of the outer loop. — Jonathan Leffler, Dec 25 '12 at 20:00

Dietmar Kühl · Answer 1 · 2012-12-25T21:41:53.310

2

The simple answer to partion a sequence is, of course, to use

auto it = std::partition(vec.begin(), vec.end(),
                         std::bind2nd(std::less<int>(), pivot));

The function doesn't really care about the predicate but rearranges the sequence into two sequences: one for which the predicate yields true and one for which the predicate yields false. The algorithm returns an iterator to the end of the first subsequence (consisting of the elements for which the predicate is true). Interestingly the algorithm is supposed to work on forward iterators (if it really gets forward iterators it can use quite a number of swaps, though). The algorithm you are implementing clearly requires bidirectional iterators, i.e., I'll ignore the requirement to also work for forward sequences.

I'd follow exactly the same interface when implementing the algorithms because the iterator abstraction works very well for sequence algorithms. The algorithm itself simply employs std::find_if() to find an iterator it in the range [begin, end) such that the predicate does not hold:

it = std::find_if(begin, end, not1(pred));

If such an iterator exists it employs std::find_if() to search in [std::reverse_iterator<It>(end), std::reverse_iterator<It>(it)) for an iterator rit such that the predicate does hold:

rit = std::find_if(std::reverse_iterator<It>(end), std::reverse_iterator<It>(it),
                   pred);

If such an iterator exists, it std::swap()s the corresponding locations and updates begin and end accordingly:

std::swap(*it, *rit);
begin = ++it;
end = (++rit).base();

If either it or rit isn't found, the algorithm terminates. Putting this logic into a consistent algorithm seems to be rather straight forward. Note that this algorithm can't even use the operator you try to use, i.e., conceptually elements can only be compared for x < pivot and x >= pivot (which is identical to !(x < privot)).

The implementation below isn't tested but the complete algorithm would look something like this:

template <typename BiIt, typename Pred>
BiIt partition(BiIt it, BiIt end, Pred pred) {
    typedef std::reverse_iterator<BiIt> RIt;
    for (RIt rit(end);
         (it = std::find_if(it, end, std::not1(pred))) != end
         && (rit = std::find_if(RIt(end), RIt(it), pred)) != RIt(it);
         ++it, end = (++rit).base()) {
         std::swap(*it, *rit);
    }
    return it;
}

edited Dec 25 '12 at 21:41

answered Dec 25 '12 at 19:08

Dietmar Kühl

150,225
13
225
380

Ok, I'll express myself more clearly. Basically, I have an array of positive integers, then, given the position of a number (let's call it x) in that set, I should arrange them in such an order that all numbers lesser than x precede it, whereas any number greater than x should be put farther in the array. If, by chance, there exists another number, equal to x, it should be put next to it. For instance, for the following input: 3 1 3 5 7 9 2 4 6 8 the answer should look like this: 3 1 4 2 5 9 6 8 7 Where number 3 in the first line of the input means the position o – user4520 Dec 25 '12 at 19:49
@user1928235: I understand how partition works although your requirement that equal numbers are specifically arranged is normally not required and actually not supported by the nature of the algorithm. It can be achieved using two partitions, though: Once the predicate is `val < x` and then the predicate is `val == x`. Except for silly mistakes I'm pretty sure that the algorithm above implements the correct logic (it isn't tested, though, and I generally have little faith in untested code). – Dietmar Kühl Dec 25 '12 at 19:55
Ok, I'm pretty confused now. I understand what You're saying (well, apart from some of the code, I guess), but then, what is this: http://stackoverflow.com/questions/7198121/quicksort-and-hoare-partition about? Their implementation is very similar to mine. – user4520 Dec 25 '12 at 20:02
@rici: Yes, there is 3-way partition but Hoare Partition isn't a 3-way partition. I didn't find a direct references for Hoare Partition and, thus, assume that the algorithms in "Introduction to Algorithms", 3rd Edition, Corman et al., page 185, is "Hoare Partition". The above algorithm is different in a subtle way (you'd need to negate the predicate to get Hoare Partition) but neither of these two algorithms sorts elements equal to the pivot together. – Dietmar Kühl Dec 25 '12 at 20:19
Exactly, this was my main problem, apart from the crash in some specific cases. So, is there a method, based on any of the above, which can help me reach my goal? – user4520 Dec 25 '12 at 20:26
@DietmarKühl, true, it's not Hoare's original algorithm. In the wikipedia code for DNF, replace `>= high` with `> low` to do the 3-way pivot. IIRC Sedgwick does it better for the expected case where there are only a few elements equal to the pivot. – rici Dec 25 '12 at 20:26
@user1928235: The main problem with your code is that doesn't make any progress if there are two pivot elements in the array: It keeps swapping them. This is primary reason the Hoare Partition uses `<=` to compare the elements. I don't think Hoare Partition algorithm partitions the sequence correctly if there are multiple copies of the pivot element in the sequence. – Dietmar Kühl Dec 25 '12 at 20:36
@user1928235: If you need to gather the elements equal to the pivot element together, you can use my partition algorithm twice: First use it with the predicate `val < pivot` giving you two subsequences: `[begin, middle)` and `[middle, end)`. All elements identical to the pivot element are in the second sequence, i.e, you'd then use the partition algorithm with the predicate `val == pivot` on `[middle, end)`. rici's reference points to a three way partition. It may be interesting to see whether the use of this algorithm is more efficient than two uses of a vanilla partition. – Dietmar Kühl Dec 25 '12 at 20:40
Indeed, this is what I've noticed, however I've thought there must be something wrong strictly with the implementation, not just the general idea. But as you said, <= is not the solution either. Can you recommend anything that would solve the problem? A modification of Hoare's algorithm, maybe? UPDATE: Well, seems like we've both posted in the same moment. Thanks, I'll look into it. If you just could, please, put it in a less sophisticated way (technically, I mean the coding aspect)? I still consider myself rather a beginner. – user4520 Dec 25 '12 at 20:41
@user1928235: The code? I wouldn't be able to get things straight using a different abstraction. I don't think this code is sophisticated, at all! It just uses trivial function calls and simple logic. Which part of the code is unclear? – Dietmar Kühl Dec 25 '12 at 20:53
This -> template ; and this -> typedef std::reverse_iterator RIt; I understand what the typedef keyword does in general, however, in this example... ? – user4520 Dec 25 '12 at 21:04
@user1928235: The algorithm can cope with fairly general sequences. Thus, it is made a template, keeping the types of the iterator and the predicate unspecified. They are deduced when the function is being called. You can probably use a fixed type instead of making `BiIt` a template argument (e.g., `std::vector::iterator`) but replacing the predicate with a fixed type would be somewhat of a mess, e.g., because `std::not1(pred)` creates a somewhat odd type. Just look-up templates. – Dietmar Kühl Dec 25 '12 at 21:17
@user1928235: `std::reverse_iterator` is just an iterator adaptor which reverses the direction in which the iterator moves: `std::find_if()` always searches from the beginning of a sequence to the end using `++it` to move an iterator to the next position. To move in the opposite direction, it would need to use `--it`. `std::reverse_iterator` does just this translation. This class template also arranges for a view of the reversed sequence which seems to start at `std::reverse_iterator(end)` although `*end` is illegal. – Dietmar Kühl Dec 25 '12 at 21:20
Ok, looks like I just need to look up some technical details to gain understanding. I'll look to it tomorrow, it's 10:30 pm in my time zone. Thank you very much; could you just explain this loop: for (Rit rit(end); ? – user4520 Dec 25 '12 at 21:35
@user1928235: Well, it should be `RIt` rather than `Rit`, of course (I'll fix the answer): It just declares a `std::reverse_iterator` called `rit` and initialized with the iterator `end` (the initialization won't matter because it will be overwritten by the results of one of the `std::find_if()`s). – Dietmar Kühl Dec 25 '12 at 21:39
Ok, thank you so much for wasting time with me, I'll write if I get something tomorrow :) – user4520 Dec 25 '12 at 22:38

Hoare partitioning algorithm - strict or not strict inequalities?

1 Answers1