Numba is running a lot slower than the raw Python

Question

I'm new to using Numba and keen to try and understand how it works.

I built a binary search function:

def binarysearch(alist, item):
        first = 0
        last = len(alist) - 1
        found = False

        while first<=last:
            midpoint = (first + last)//2            
            if alist[midpoint] == item:
                return midpoint
            else:
                if item < alist[midpoint]:
                    last = midpoint-1
                else:
                    first = midpoint+1  
        return -1

And if I run this as follows:

l = list(range(10000000))

%timeit
binarysearch(l,5000)

I get: 4.88 µs ± 315 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

If I do the same but use @numba.njit, I get: 19.8 s ± 533 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Any ideas for how to get the improvements of Numba?

versions: Python: 3.6.3 Numba: 0.40.0

That’s interesting. Have you tried a typed array instead of a list? Also maybe profile it by having it search for random numbers instead of always the same one. Could you count the number of iterations? 19s seems awfully slow for numba. — AlexNe, Jul 18 '20 at 14:34
Just tried: ``` import random from array import array l = array('i',range(10000000)) x = random.randint(0,10000000) ``` and got a time of: ```531 ns ± 12.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)``` and for the raw python: ```5.57 µs ± 282 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)``` So that seems to have worked... — Xander, Jul 18 '20 at 15:02
Any idea why lists don't work and array's do? Even with typing? — Xander, Jul 18 '20 at 15:10
Not really. In Python, list items can be of arbitrary type and size, it might be interesting on reading up on those: https://stackoverflow.com/questions/24171491/binary-layout-of-python-lists. Numba might have difficulties optimizing that. OTOH the memory address of each item in a numpy array/typed array/etc. is know before the loop even starts, so I’d assume this makes for way less overhead. — AlexNe, Jul 18 '20 at 15:18

Numba is running a lot slower than the raw Python

0 Answers0