The traditional Heapsort
algorithm swaps the last element of the heap with the root of the current heap after every heapification
, then continues the process again. However, I noticed that it is kind of unnecessary.
After a heapification
of a sub-array, while the node contains the highest value (if it's a max-heap
), the next 2 elements in the array must follow the root in the sorted array, either in the same order as they are now, or exchanging them if they are reverse-sorted. So instead of just swapping the root with the last element, won't it be better to swap the first 3 elements (including the node and after the if necessary exchange of the 2nd and 3rd elements) with the last 3 elements, so that 2 subsequent heapifications
(for the 2nd and 3rd elements ) are dispensed with?
Is there any disadvantage with this method (apart from the if-needed swapping of the 2nd and 3rd elements, which should be trivial)? If not, if it is indeed better, how much performance boost will it give? Here is the pseudo-code:
function heapify(a,i) {
#assumes node i has two nodes, both heaps. However node i itself might not be a heap, i.e one of its children may be greater than its value.
#if so, find the greater of its two children, then swp the parent with that value.
#this might render that child as no longer a heap, so recurse
}
function create_heap(a) {
#all elements following index |_n/2_| are leaf nodes, thus heapify() should be applied to all elements within index 1 to |_n/2_|
}
function heapsort(a) {
create_heap(a); #a is now a max-heap
#root of the heap, that is a[1] is the maximum, so swap it with a[n].
#The root now contains an element smaller than its children, both of which are themselves heaps.
#so apply heapify(a,1). Note: heap length is now n-1, as a[n] is the largest element already found
#now again the array is a heap. The highest value is in a[1]. Swap it with a[n-1].
#continue
}
Suppose the array is [4,1,3,2,16,9,10,14,8,7]
. After running a heapify
, it will become [16,14,10,8,7,9,3,2,4]
. Now the heapsort
's first iteration will swap 16 and 4, leading to [4,14,10,8,7,9,3,2,16]
. Since this has now rendered the root of the new heap [4,14,10,8,7,9,3,2]
as, umm, un-heaped, (14 and 10 both being greater than 4), run another heapify
to produce [14,8,10,4,7,9,3,2]
. Now 14 being the root, swap it with 2 to yield [2,8,10,4,7,9,3,14]
, thus making the array currently [2,8,10,4,7,9,3,14,16]
. Again we find that 2 is un-heaped, so again doing a heapify
makes the heap as [10,8,9,4,7,2,3]
. Then 10 is swapped with 3, making the array as [3,8,9,4,7,2,3,10,14,16]
. My point is that instead of doing the 2nd and 3rd heapification
s to store 10 and 14 before 16, we can tell from the first heapification
that because 10 and 14 follow 16, they are the 2nd and 3rd largest elements (or vice-versa). So after a comparison between them (in case they are already sorted, 14 comes before 10), I swap all the there (16,14,10)
with (3,2,4)
, making the array [3,2,4,8,7,9,16,14,10]
. This reduces us to a similar condition as the one after the further two heapification
s - [3,8,9,4,7,2,3,10,14,16]
originally, as compared to [3,2,4,8,7,9,16,14,10]
now. Both will now need further heapification
, but the 2nd method has let us arrive at this juncture directly by just a comparison between two elements (14 and 10).