1

I want to do binary search on reverse sorted lists in Python. The list is appended in reverse order, and it has to be like this or else my code will break.

I need the code to be as fast as possible, assume the list is huge. I know Python code is slow so the code has to be compiled. And I know the bisect module in the standard library imports precompiled _bisect module if found, _bisect is implemented in C which is very fast.

However it doesn't work on reverse sorted lists, I tried key = lambda x: -x and it didn't work:

In [51]: l = range(50, 0, -5)

In [52]: from bisect import bisect

In [53]: bisect(l, 18, key=lambda x: -x)
Out[53]: 10

I copied code from the file from the installation and modified it:

def bisect_right(a, x):
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x < a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

Change x < a[mid] to x > a[mid] will make it work on reverse sorted lists.

def reverse_bisect(a, x):
    lo, hi = 0, len(a)
    while lo < hi:
        mid = (lo + hi) // 2
        if x > a[mid]:
            hi = mid
        else:
            lo = mid + 1
    return lo

The uncompiled code is much slower than precompiled code, as expected. Also note I cannot compare the performance of reverse_bisect against _bisect.bisect because the latter doesn't work properly in this case.

In [55]: %timeit bisect(range(0, 10**7, 10), 4096)
2.91 µs ± 97.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [56]: %timeit bisect_right(range(0, 10**7, 10), 4096)
5.22 µs ± 87.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

So I tried to compile the functions using numba.jit, I added @numba.jit(nopython=True, fastmath=True, cache=True, forceobj=False) to the line before bisect_right, producing bisect_right_nb, but bisect_right_nb gives me a warning and is SIX ORDERS OF MAGNITUDE SLOWER:

In [59]: @numba.jit(nopython=True, fastmath=True, cache=True, forceobj=False)
    ...: def bisect_right_nb(a: list, x: int):
    ...:     lo, hi = 0, len(a)
    ...:     while lo < hi:
    ...:         mid = (lo + hi) // 2
    ...:         if x < a[mid]:
    ...:             hi = mid
    ...:         else:
    ...:             lo = mid + 1
    ...:     return lo

In [60]: l = list(range(0, 10**7, 10))

In [61]: %timeit bisect_right_nb(l, 4096)
C:\Python310\lib\site-packages\numba\core\ir_utils.py:2149: NumbaPendingDeprecationWarning:
Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'a' of function 'bisect_right_nb'.

For more information visit https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "<ipython-input-59-23a3cb61146c>", line 2:
@numba.jit(nopython=True, fastmath=True, cache=True, forceobj=False)
def bisect_right_nb(a: list, x: int):
^

  warnings.warn(NumbaPendingDeprecationWarning(msg, loc=loc))
1.66 s ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [62]: 1.66*10**6/5.22
Out[62]: 318007.66283524904

Just how can I improve the performance of reverse_bisect_right?


And no, I have read this answer, I am not trying to insert the element into the list, I am trying to get rid of all elements smaller than it.

(And my network connection is unstable, my ISP is disrupting my VPN connection, so my update took so long.)

Ξένη Γήινος
  • 2,181
  • 1
  • 9
  • 35
  • *"I tried key = lambda x: -x and it didn't work"* - Well you forgot to negate the search value (`18` in your example). – Kelly Bundy Jul 24 '23 at 16:33

1 Answers1

2

Don't ignore warnings

Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'a' of function 'bisect_right_nb'.

If we adress that and use a TypedList, then we get a nice speedup:

l = nb.typed.List(list(range(0, 10**7, 10)))
%timeit bisect_right_nb(l, 4096)
1.14 µs ± 1.21 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

compared to

l = list(range(0, 10**7, 10))
%timeit bisect_right_nb(l, 4096)
753 ms ± 38.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
FlyingTeller
  • 17,638
  • 3
  • 38
  • 53
  • How much time does the creation of the typed list take? (Btw it's not really called `TypedList`, is it?) – Kelly Bundy Jul 24 '23 at 16:40
  • The class is called `List`, in the `numba` docs it is referred to `typed-list` and it is implemented in `typedlist.py`. For the creation time: if I read the implementation correctly, it should be O(n), but I havn't benchamarked ot – FlyingTeller Jul 25 '23 at 05:42