Let's pick the definition of Big-O notation from the Wikipedia:
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity.
...
In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows.
So Big-O is similar to:

So when you are comparing two algorithms on the small ranges/numbers, you can't strongly rely on Big-O. Let's analyze the example:
We have two algorithms: the first is O(1) and works for exactly 10000 ticks and the second is O(n^2). So in the range 1~100 the second will be faster than the first (100^2 == 10000
so (x<100)^2 < 10000
). But from the 100 the second algorithm will be slower than the first.
The similar behaviour is in your functions. I timed them with various input lengths and constructed timing plots. Here is timings for your functions on big numbers (yellow is sort
, blue is heap
):

You can see that sort
is consuming more time than heap
, and the time is raising faster than heap's
. But if we will look closer on lower range:

We will see that on small range sort
is faster than heap
! Looks like heap
has "default" time consumation. So it is not wrong that algorithm with worse Big-O works faster than algorithm with better Big-O. It just means that their range usage is too small for better algorithm to be faster than the worse.
Here is the timing code for the first plot:
import timeit
import matplotlib.pyplot as plt
s = """
import heapq
def k_heap(points, K):
return heapq.nsmallest(K, points, key = lambda P: P[0]**2 + P[1]**2)
def k_sort(points, K):
points.sort(key = lambda P: P[0]**2 + P[1]**2)
return points[:K]
"""
random.seed(1)
points = [(random.random(), random.random()) for _ in range(1000000)]
r = list(range(11, 500000, 50000))
heap_times = []
sort_times = []
for i in r:
heap_times.append(timeit.timeit('k_heap({}, 10)'.format(points[:i]), setup=s, number=1))
sort_times.append(timeit.timeit('k_sort({}, 10)'.format(points[:i]), setup=s, number=1))
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
#plt.plot(left, 0, marker='.')
plt.plot(r, heap_times, marker='o')
plt.plot(r, sort_times, marker='D')
plt.show()
For second plot, replace:
r = list(range(11, 500000, 50000)) -> r = list(range(11, 200))
plt.plot(r, heap_times, marker='o') -> plt.plot(r, heap_times)
plt.plot(r, sort_times, marker='D') -> plt.plot(r, sort_times)