-1

I was looking at different sorting algorithms and their performance (link) and then I tried to implement some sorting algorithms myself. I wanted to improve them as well and so, as I was coding the insertion sort, I thought why not to use binary search, as the first part of array is already sorted, and in order to get rid of swaps, use an additional array. The code could be found on my GitHub or here:

def insertion_sort_optimized(arr):
    array = [arr[0]]
    for i in range(1, len(arr)):
        index = bisect_left(array, i)
        array.insert(index, i)
    return array

And then I implememented Quicksort as usually (link):

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    smaller_part = []
    larger_part = []
    pivot = arr[len(arr) - 1]
    for i in range(len(arr) - 1):
        if arr[i] < pivot:
            smaller_part.append(arr[i])
        else:
            larger_part.append(arr[i])
    return quicksort(smaller_part) + [pivot] + quicksort(larger_part)

And then I generated a random array of 1.000.000 numbers and compared the performance of both of them, using this helper function:

def test_sorting_algorithm(sort=None, version="normal", returns=False, n=1000000):
    if version.lower() == "normal":
        print("Normal version:")
    else:
        print("Optimized version:")
    arr = [int(random() * n) for _ in range(n)]
    print(arr)

    start = time.time()
    if returns:
        arr = sort(arr)
    else:
        sort(arr)
    end = time.time()

    print(arr)
    print(f"Time elapsed: {end - start}\n")

So it basically runs the given sort function and prints how much time did it take to sort the array. And so I ran this code for at least 10 times and the binary insertion sort was almost 9 times faster (9s > 1s) each time. But I thought that the Quicksort was the fastest... And if I compare those 2 sorting algorithms, I would say that binary insertion sort is better, although it takes O(n) additional space (worst time complexity would be O(n*log(n)) which is better than Quicksort...). Is it a mistake? Is Quicksort actually worse than binary insertion sort? I tried to find it all over the Internet but couldn't find something really similar to my code. Maybe it is not even binary insertion sort, but rather something else...(another name)?

Ilya Maier
  • 475
  • 3
  • 12
  • Your quicksort isn't efficiently implemented, you've made it quadratic time: `quicksort(smaller_part) + [pivot] + quicksort(larger_part)` Also, in quicksort, the choice of pivot is crucial, and you are using the most naive strategy. Finally note, quicksort is not the most widely used. Python, for example, uses a highly tuned adaptive mergesort, Timsort. So does Java now, be default (adopting Timsort). – juanpa.arrivillaga Apr 16 '20 at 20:30
  • @HeapOverflow what do you mean? I don't get your point... How could I improve it? – Ilya Maier Apr 16 '20 at 20:31
  • Your quicksort looks like it is re-making the contents, whereas it was envisaged as an in-place sort. Perhaps the time is taken making all those copies. – quamrana Apr 16 '20 at 20:31
  • But shouldn't be Quicksort still be the fastest? In comparison to Timsort and adaptive mergesort? – Ilya Maier Apr 16 '20 at 20:33
  • @quamrana why is it remaking the contents? and how could quicksort be an in-place sort? it is based on divide and conquer, isn't it? – Ilya Maier Apr 16 '20 at 20:34
  • @IlyaMaier no, it shouldn't. Why do you think it should? – juanpa.arrivillaga Apr 16 '20 at 20:35
  • For example `smaller_part.append(arr[i])` is creating a new list one element at a time from the parameter. Also `quicksort(smaller_part) + [pivot] + quicksort(larger_part)` creates yet another list. – quamrana Apr 16 '20 at 20:37
  • @juanpa.arrivillaga I was looking at different sites and almost all of them were saying that quicksort is the fastest. for example [here](https://www.toptal.com/developers/sorting-algorithms) – Ilya Maier Apr 16 '20 at 20:37
  • @quamrana i know that. I guess I should try make it with swaps and see what happens – Ilya Maier Apr 16 '20 at 20:40
  • Here is an [answer](https://stackoverflow.com/a/1909374/4834) where I give an incomplete answer to a question about Big-O notation. Note that if an algorithm is badly implemented and uses a large `k`, then an inferior algorithm may out-perform it in real-world tests, – quamrana Apr 16 '20 at 20:40
  • @quamrana but I know what Big O is... – Ilya Maier Apr 16 '20 at 20:42
  • 2
    But your comments lead me to suspect that you don't know about Big-O – quamrana Apr 16 '20 at 20:44

2 Answers2

0

Let's look at your attempt to write insertion sort:

def insertion_sort_optimized(arr):
    array = [arr[0]]
    for i in range(1, len(arr)):
        index = bisect_left(array, i)
        array.insert(index, i)
    return array

You're not inserting the array values, you're inserting the indexes. In increasing order. So this is wrong and it's O(n log n), instead of the O(n^2) that a correct version would take (due to the linear time for each insert).

Kelly Bundy
  • 23,480
  • 7
  • 29
  • 65
0

This is my Binary Insertion Sort Time complexity is (N^2)/4 + Nlog2(N/(2e))

def BinaryIndexing(arr: Union[Iterable, List, Tuple, np.ndarray], value, reverse: bool = False):
    if value < arr[0]:
        return 0
    elif value > arr[-1]:
        return len(arr)

    start, end = 0, len(arr)
    while True:
        if end - start <= 1:
            if reverse is False:
                if arr[start] > value:
                    return start
                else:
                    return start + 1
            else:
                if arr[start] < value:
                    return start
                else:
                    return start + 1

        mid = (start + end) // 2

        if reverse is False:
            if value < arr[mid]:
                end = mid
            else:
                start = mid
        else:
            if value > arr[mid]:
                end = mid
            else:
                start = mid


def BinaryInsertionSortOptimized(array: Union[Iterable, List, Tuple, np.ndarray], reverse: bool = False):
    if reverse is False:
        for index in range(1, len(array)):
            if array[index - 1] < array[index]:
                continue
            i = BinaryIndexing(arr=array[0:index], value=array[index], reverse=reverse)
            key_point = array[index]
            array[i + 1:index + 1] = array[i:index]
            array[i] = key_point

    else:
        for index in range(1, len(array)):
            if array[index - 1] > array[index]:
                continue
            i = BinaryIndexing(arr=array[0:index], value=array[index], reverse=reverse)
            key_point = array[index]
            array[i + 1:index + 1] = array[i:index]
            array[i] = key_point
    return array