If I have a list
and need to sort it, is there a good reason to use list.sort()
over heapq.heapify(list)
, given that heapify
is O(N)
(link) and .sort()
is O(NlogN)
?
Why isn't the default sort algorithm heapify if heapify is faster?
If I have a list
and need to sort it, is there a good reason to use list.sort()
over heapq.heapify(list)
, given that heapify
is O(N)
(link) and .sort()
is O(NlogN)
?
Why isn't the default sort algorithm heapify if heapify is faster?
as guys have said in the comment section, "heapify" rearranges elements of the list to form a heap-oriented binary tree. complete binary trees can be represented using arrays or in other words, any array which is fully filled with comparable values can be logically viewed as a complete binary tree.
by definition from wikipedia, complete binary tree is:
A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.
see an example:
a = [5, 3, 8, 2, 4, 6, 10]
---> heapq.heapify(a)
to form a heap you need to satisfy 2 rules:
that's exactly what heapify does, it rearranges elements in the array to satisfy those 2 conditions.
as you can see in the picture above, just by taking the top element (root) each time, you'll access tree elements in a sorted fashion or in other words you can use heaps to sort your array in n*logn time:
import heapq
a = [5, 3, 8, 2, 4, 6, 10]
def sort_with_heap(a):
"""
sort O(nlogn)
"""
heapq.heapify(a) # heapify's the list O(n)
while len(a) > 0: # O(n) times
yield heapq.heappop(a) # O(logn)
print(list(sort_with_heap(a)))
>> [2, 3, 4, 5, 6, 8, 10]
note: that the example above is for the sake of understanding how heaps can be used for sorting and is not space optimal. to do proper sorting using heaps take a look at heapsort (which works with max-oriented heaps) and doesn't use any additional space than the sorting array itself. i've wrote one myself in here.