I've compared the performances of some built-in sort functions in python 3 versus some algorithms that I know their performances, and I saw some unexpected results. I would like to clarify those issues.
I used the "perfplot" library to compare the complexity of the algorithms visualized.
import numpy as np
import perfplot
# algorithms:
def sort(x):
np.sort(x)
return 0
def sort_c(x):
np.sort_complex(x)
np.sort_complex(x)
return 0
def builtin(x):
sorted(x)
return 0
def c_linear_inplace(x):
for i in range(len(x) - 1):
if x[i] > x[i + 1]:
x[i] = x[i] + x[i + 1]
x[i + 1] = x[i] - x[i + 1]
x[i] = x[i] - x[i + 1]
return 0
def c_linear_outplace(x):
a = x.copy()
for i in range(len(x) - 1):
if x[i] > x[i + 1]:
a[i] = x[i + 1]
a[i + 1] = x[i]
x = a.copy()
return 0
def c_nlogn(x):
logn = int(np.ceil(np.log2(len(x))))
for i in range(len(x)-1):
for j in range(logn):
x[i] = 0
return 0
#comprasion:
perfplot.show(
setup=np.random.rand, # function to generate input for kernel by n
kernels=[
sort,
sort_c,
builtin,
c_linear_inplace,
c_linear_outplace,
c_nlogn,
],
labels=["sort", "sort_c", "builtin", "check: linear inplace", "check: linear not in place", "check: nlogn"],
n_range=[2 ** k for k in range(15)], # list of sizes of inputs, i"setup" function will be called with those values
xlabel="len(a)",
)
I expect all sort functions to be close to the nlogn() function or at least less efficient than the linear ones, and I expect "c_linear_inplace" would be faster than "c_linear_outplace". but all built-in sort functions were much faster than the linear functions and the "c_linear_outplace" function was slower than the "c_linear_inplace" function.
EDIT:
as I see it, those are the complexities of the functions with constants:
sort, sort_c, builtin : cnlog2(n) for c>=1
check: linear inplace : 6n
check: linear not in place : 7n
check: nlogn : 2n + 3nlog2n
I calculated any for loop as 2*(number of iterations) for the check and increment. and any "if" as O(1) and any assignment (=) as O(1) and it seems odd that "check: linear not in place" that even takes more memory has much better performances than "check: linear inplace" and still worse than any of numpy's sorts