I was implementing and testing answers to this SO question -
Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
The answer with the most upvotes (currently) only provides a text description of an algorithm that should be O(NlogN):
Sort the array... . For each element x in the array: Consider the array slice after the element. Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0. Output all elements (x, y) from y0 forwards as long as x + y <= b. ... If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for.
My implementation:
import bisect
def ani(arr, a, b):
# Sort the array (say in increasing order).
arr.sort()
count = 0
for ndx, x in enumerate(arr):
# Consider the array slice after the element
after = arr[ndx+1:]
# Do a binary search on this array slice for [a - x], call it y0
lower = a - x
y0 = bisect.bisect_left(after, lower)
# If you only need to count the number of pairs
# Modify the ... algorithm so [b - x] ... is also searched for
upper = b - x
y1 = bisect.bisect_right(after, upper)
count += y1 - y0
return count
When I plot Time versus N or some function of N I am seeing an exponential or N^2 response.
# generate timings
T = list() # run-times
N = range(100, 10001, 100) # N
arr = [random.randint(-10, 10) for _ in xrange(1000000)]
print 'start'
start = time.time()
for n in N:
arr1 = arr[:n]
t = Timer('ani(arr1, 5, 16)', 'from __main__ import arr1, ani')
timing_loops = 100
T.append(t.timeit(timing_loops) / timing_loops)
Is my implementation incorrect or is the author's claim incorrect?
Here are some plots of the data.
T vs N
T / NlogN vs N - one commenter thought this should NOT produce a linear plot - but it does.
T vs NlogN - I thought this should be linear if the complexity is NlogN but it is not.