For the sake of the argument, consider following (very bad) sorting algorithm in python:
def so(ar):
while True:
le = len(ar)
switch = False
for y in range(le):
if y+1 == le:
break
if ar[y] > ar[y+1]:
ar[y],ar[y+1] = ar[y+1],ar[y]
switch = True
if switch == False:
break
return ar
I'm trying to understand the concept of "complexity of the algorithm" and there is one thing I don't get.
I came across the post that explains how to find the complexity of the algorithm here:
You add up how many machine instructions it will execute as a function of the size of its input, and then simplify the expression to the largest (when N is very large) term and can include any simplifying constant factor.
But well, the problem is, that I cannot calculate how many machine instructions will be executed just by knowing the length of the list.
Consider first example:
li = [random.randint(1,5000) for x in range(3000)]
start = time.time()
so(li)
end = time.time() - start
print(end)
Output: 2.96921706199646
Now have a look at the second example:
ok = [5000,43000,232] + [x for x in range(2997)]
start = time.time()
so(ok)
end = time.time() - start
print(end)
Output: 0.010689020156860352
We can see that the same sorting algorithm, two different lists, lists are the same length, and two completely different execution times.
When people are talking about algorithm complexity (big O notation) they normally assume that the only variable that determines complexity of the algo is the size of the input, but clearly, in the example above it is not the case. It is not only the size of the list, but also the positioning of each value within the list that determines the speed of the sorting.
So my question is, why do we only consider size of input when estimating complexity? And, if it is possible, can you tell me what the complexity of the algorithm above will be?