Well, there's a good chance that bisect_left is an O(log N) operation (binary search), so your overall operation would be O(K log N), where N relates to the size of l and K relates to the size of m.
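For reference, here's a tiny sketch of what that computes; the names l and m are taken from your question, and I'm assuming your comprehension is essentially [bisect.bisect_left(l, x) for x in m]:

import bisect

l = [1, 3, 3, 7, 10]   # the sorted haystack
m = [0, 3, 8]          # the values to locate

# bisect_left returns the leftmost index at which each value could be
# inserted into l while keeping it sorted
print([bisect.bisect_left(l, x) for x in m])   # [0, 1, 4]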
Were the second list m sorted as well, you could make this an O(N+K) operation simply by running an index through both lists concurrently (a merge-style walk), as sketched below.
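That walk might look something like the following sketch (merge_bisect_left is just an illustrative name, not a library function); it gives the same answers as bisect_left provided both inputs are sorted:

import bisect
import random

def merge_bisect_left(haystack, needles):
    # For each value in sorted needles, find the index at which it would be
    # inserted into sorted haystack (same result as bisect_left), advancing a
    # single haystack index rather than binary-searching per needle.
    result = []
    i = 0
    for needle in needles:
        while i < len(haystack) and haystack[i] < needle:
            i += 1
        result.append(i)
    return result

# quick sanity check against the bisect-based comprehension
haystack = sorted(random.random() for _ in range(1000))
needles = sorted(random.random() for _ in range(100))
assert merge_bisect_left(haystack, needles) == [bisect.bisect_left(haystack, n) for n in needles]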
But, re your "I suspect this is slow" comment, your first move should always be to test the easiest solution with the largest expected data set. If that's workable, stop right there! Only if it's deficient do you start thinking about optimisation.
For example, consider the following program:
import random
import bisect
haystack = []
for _ in range(1000000):
    haystack.append(random.random())    # a million random "haystack" values
haystack.sort()
needles = []
for _ in range(10000):
    needles.append(random.random())     # ten thousand values to look up
result = [bisect.bisect_left(haystack, needle) for needle in needles]
print(result)
This creates a 1,000,000-element haystack and a 10,000-element needle list, then uses your bisect-ing list comprehension to do the work. Running that on my (not particularly grunty) desktop with time shows:
real 0m0.738s # < 3/4 of a second elapsed
user 0m0.578s
sys 0m0.109s
And this includes the time taken to construct the lists, sort the big one, and print out the results.
Using timeit to get rid of all that setup time can be done with:
import timeit
import random
import bisect
haystack = []
for _ in range(1000000):
    haystack.append(random.random())
haystack.sort()
needles = []
for _ in range(10000):
    needles.append(random.random())
print(timeit.timeit(
    '[bisect.bisect_left(haystack, needle) for needle in needles]',
    setup='from __main__ import bisect, haystack, needles',
    number=1000))
The output of that is 12.27 seconds for the thousand iterations, which means you could do it about 80 times per second without breaking a sweat.
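As an aside, on Python 3.5 or later timeit also accepts a globals argument, which avoids the 'from __main__ import ...' setup string; a variant of the same measurement might look like:

import timeit
import bisect
import random

haystack = sorted(random.random() for _ in range(1000000))
needles = [random.random() for _ in range(10000)]

# globals=globals() lets the timed statement see haystack, needles and bisect directly
elapsed = timeit.timeit(
    '[bisect.bisect_left(haystack, needle) for needle in needles]',
    globals=globals(),
    number=1000)
print('{:.2f}s total, about {:.0f} runs per second'.format(elapsed, 1000 / elapsed))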