I wrote two methods to sort a list of numbers, they have the same time complexity: O(n^2), but the actually running time is 3 times difference ( the second method uses 3 times as much time as the first one ).
My guess is the difference comes from the memory hierarchy ( count of register/cache/memory fetches are quite different ), is it correct?
To be specific: the 1st method compare one list element with a variable and assign value between them, the 2nd method compares two list elements and assign values between them. I guess this means the 2nd method has much more cache/memory fetches than the 1st one. Right?
When list has 10000 elements, the loop count and running time are as below:
# TakeSmallestRecursivelyToSort Loop Count: 50004999
# TakeSmallestRecursivelyToSort Time: 7861.999988555908 ms
# CompareOneThenMoveUntilSort Loop Count: 49995000
# CompareOneThenMoveUntilSort Time: 17115.999937057495 ms
This is the code:
# first method
def TakeSmallestRecursivelyToSort(input_list: list) -> list:
"""In-place sorting, find smallest element and swap."""
count = 0
for i in range(len(input_list)):
#s_index = Find_smallest(input_list[i:]) # s_index is relative to i
if len(input_list[i:]) == 0:
raise ValueError
if len(input_list[i:]) == 1:
break
index = 0
smallest = input_list[i:][0]
for e_index, j in enumerate(input_list[i:]):
count += 1
if j < smallest:
index = e_index
smallest = j
s_index = index
input_list[i], input_list[s_index + i] = input_list[s_index + i], input_list[i]
print('TakeSmallestRecursivelyToSort Count', count)
return input_list
# second method
def CompareOneThenMoveUntilSort(input_list: list) -> list:
count = 0
for i in range(len(input_list)):
for j in range(len(input_list) - i - 1):
count += 1
if input_list[j] > input_list[j+1]:
input_list[j], input_list[j+1] = input_list[j+1], input_list[j]
print('CompareOneThenMoveUntilSort Count', count)
return input_list