Calculating average time for sort methods

Question

In my test class Ordena, I'm trying to calculate the average time for the execution of 30 repeats for some sort methods, for example, insertion sort:

@staticmethod
def insertionSort(v):
    for i in range(len(v)-1):
        temp = v[i]
        j = i-1
        while j >=0 and temp < v[j]:
            v[j+1] = v[j]
            j = j-1
        v[j + 1] = temp

My timing and averaging function looks like this:

@staticmethod
def average(x, function, repeat=30):
    total_time = 0
    sequence = Ordena.getList(x)
    print(sequence)
    for n in range(repeat):
        time_start = time.time()
        print(time_start)
        assert function(sequence)
        total_time += time.time()+time_start
        print(total_time)

I'm doing the implementation myself (without using the timeit module) for academic practice.

Now when I call the average function like Ordena.average(x, Ordena.insertionSort), I'm getting the following error:

  File "<input>", line 1, in <module>
  File "C:/Users/sandr/PycharmProjects/Trabalho 3/main.py", line 140, in average
    assert function(sequence)
AssertionError

What am I doing wrong here?

What is `function`? `insertionSort`? If it is, it returns `None` and the `assert` fails. — Klaus D., Mar 14 '21 at 17:52
@SandraSilva I incorporated the information from the comments, does the result reflect your intention correctly? — Wolf, Mar 14 '21 at 21:38

Wolf · Accepted Answer · 2021-03-15T09:08:55.690

The line assert function(sequence) causes that the sort function is invoked, consuming the time you want to measure, and, after that, checks if the function result is something that evaluates to True (that's what assert is for). Since you are calling it for your sort function (insertionSort) which implicitly returns None, the result will be evaluated to False hence the error. In order to measure how much time a function consumes, you just call it and you'll be fine.

So, by removing assert before function(sequence) the AssertionError will be gone.

But there are still other issues with your approach:

you include the time for printing the start_time
you include the time for adding to total_time
you sorting the sequence with the first call and working on a sorted sequence later
you don't actually average

Possible approach

Something like this will probably give you better results (class stuff stripped for simplicity):

import random
import time

def getSampleList(x):
    return random.sample(range(x), x)

def insertionSort(v):
    for i in range(len(v)-1):
        temp = v[i]
        j = i-1
        while j >=0 and temp < v[j]:
            v[j+1] = v[j]
            j = j-1
        v[j + 1] = temp

def average(x, function, repeat=30):
    # get the shuffled sample list once
    mastersequence = getSampleList(x)
    # create repeat copies of this sequence
    sequences = [mastersequence[:] for i in range(repeat)]
    time_start = time.time()
    for sequence in sequences:
        function(sequence)
    return (time.time() - time_start)/repeat
        
print(average(100, insertionSort, 10_000))

Refinement

After some helpful comments by Kelly Bundy ^{thanks by the way}, I figured out that the high memory consumption (I already was aware of) isn't the only problem with this approach. An important problem is that its result heavily depends on just one random shuffle. For some algorithms, the sorting speed is closely coupled to how sorted the input is, think for example of Bubblesort on sorted input. So let's look into average2, a variation of average (which is also closer to the function in the question):

def average2(x, function, repeat=30):
    timeConsumption = 0
    for _ in range(repeat):
        sequence = getSampleList(x)
        time_start = time.time()
        function(sequence)
        timeConsumption += time.time() - time_start
    return timeConsumption/repeat

print('avarage:')
for i in range(10):
    print(f"{average(100, insertionSort, 10_000):.7f}")
    
print('avarage2:')
for i in range(10):
    print(f"{average2(100, insertionSort, 10_000):.7f}")

See the results for 10 runs of average and avarage2 (rounded to 7 digits):

Here it gets obvious that the variance of the results of the second measurement is lower, so I would trust it more.

@KellyBundy Fixed. Thanks so much, you were absolutely correct about this :) ...and this made exactly the difference in timing I talked about ;) — Wolf, Mar 14 '21 at 22:46
Btw if you're willing to spend that much memory for the copies, it might be better to instead create different shuffles. Takes more time than copying, but compared to the sorting it's still negligible. Though in reality I'd rather do it more like the OP, but create a new shuffle inside the loop before taking the start time. — Kelly Bundy, Mar 14 '21 at 22:54
@KellyBundy you are probably right. The problem is, that my surrogate implementation of `Ordena.getList` is based on assumptions. — Wolf, Mar 14 '21 at 23:04
@KellyBundy As to measure sorting algorithms, you need to subtract the time for copying, or shuffling, etc. maybe it would be better to subtract the avg time of shuffling from the avg time of shuffling plus sorting to get avg time of sorting. — Wolf, Mar 14 '21 at 23:08
You don't need to subtract it if you don't include it in the first place :-). Also, the time `(shuffle1+sort) - shuffle2` differs from the time `sort` by `shuffle1 - shuffle2`, so that would add an inaccuracy. — Kelly Bundy, Mar 14 '21 at 23:37
@KellyBundy Your last comment made me think again of my solution attempt, and I saw that it has an important flaw. So I added an alternative that seems close to your suggestion. We know that performance measurements on sorting algorithms depend heavily on the nature of the input (think of the unbeatable speed of Bubblesort on sorted input). This can also be seen in my sample outputs. — Wolf, Mar 15 '21 at 07:59
Yes, that's how I imagined it, although now that you're measuring time differences of less than a millisecond, you'd really better use `time.perf_counter` instead of `time.time`. For example on my Windows PC, `time` has 1 millisecond resolution while `perf_counter` has 0.1 microsecond resolution (10000 times finer). See [experiment](https://pastebin.pl/view/raw/2b10b9df). — Kelly Bundy, Mar 15 '21 at 16:14
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/229944/discussion-between-wolf-and-kelly-bundy). — Wolf, Mar 15 '21 at 16:21
@Wolf Ordena is a class created to implement the sort methods: insertion sort, merge sort, etc. — Sandra Silva, Mar 15 '21 at 21:10

Calculating average time for sort methods

1 Answers1

Possible approach

Refinement