I'm trying to speed up a comparison between two point clouds. My original code took up to an hour to complete, so I've cut it down to the version below and tried to apply numba. The code works except for the scipy cdist call. This is my first attempt at using numba; where am I going wrong?

from numba import jit
from scipy.spatial.distance import cdist

@jit(nopython=True)
def near_dist_top(T, B):
    # Split the coordinates of T into separate x, y, z lists
    xi = [i[0] for i in T]
    yi = [i[1] for i in T]
    zi = [i[2] for i in T]
    XB = B

    insert_params = []

    for i in range(len(T)):
        XA = [T[i]]
        # Distance from point i of T to its nearest point in B
        # (this is the call that fails in nopython mode)
        disti = cdist(XA, XB, metric='euclidean').min()
        insert_params.append((xi[i], yi[i], zi[i], disti))
        # print("Top: " + str(i) + " of " + str(len(T)))
        print(i)
    return insert_params

Edits

Both T and B are lists of coordinates

(580992.507, 4275268.8321, 192.4599), (580992.507, 4275268.8391, 192.4209), (580992.507, 4275268.8391, 192.4209)

Hmm, does numba handle lists, or does the input need to be a numpy array? And would cdist handle a numpy array?
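On the numpy array question: cdist accepts array-like input, so the per-point loop can be replaced by a single call on two arrays, with no numba involved. Below is a minimal sketch of that idea. The function name near_dist_vectorized and the tiny sample points are made up for illustration, and the approach assumes the full len(T) x len(B) distance matrix fits in memory, which may not hold for large point clouds.

```python
import numpy as np
from scipy.spatial.distance import cdist


def near_dist_vectorized(T, B):
    """Return rows of (x, y, z, distance to nearest point in B) for each point in T."""
    # Convert the lists of coordinate tuples to float arrays of shape (n, 3)
    T = np.asarray(T, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    # One cdist call builds the whole len(T) x len(B) distance matrix;
    # the row-wise minimum is the nearest-neighbour distance for each point in T
    nearest = cdist(T, B, metric="euclidean").min(axis=1)
    return np.column_stack((T, nearest))


# Hypothetical sample data, not the real point clouds
T = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
B = [(0.0, 0.0, 1.0), (1.0, 1.0, 3.0)]
result = near_dist_vectorized(T, B)
print(result)
```

If the matrix is too large, the same call could be applied to chunks of T in turn.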

The error

numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'cdist': cannot determine Numba type of <class 'function'>

File "scratch_28.py", line 132:
def near_dist_top(T, B):
    <source elided>
        XA = [T[i]]
        disti = cdist(XA, XB, metric='euclidean').min()
        ^
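The TypingError above says that numba cannot type the scipy function inside a nopython function. One way around it, sketched below, is to write the nearest-distance search as plain loops over numpy arrays, which nopython mode can compile. The function name min_dists and the sample arrays are hypothetical, and the import fallback is only there so the sketch still runs (slowly) where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback: without Numba the function runs as ordinary Python
    def njit(func):
        return func


@njit
def min_dists(T, B):
    """For each row of T, return the Euclidean distance to the nearest row of B."""
    out = np.empty(T.shape[0])
    for i in range(T.shape[0]):
        best = np.inf
        for j in range(B.shape[0]):
            # Accumulate the squared distance between T[i] and B[j]
            d2 = 0.0
            for k in range(T.shape[1]):
                diff = T[i, k] - B[j, k]
                d2 += diff * diff
            if d2 < best:
                best = d2
        # Take the square root once per point rather than once per pair
        out[i] = np.sqrt(best)
    return out


# Hypothetical sample data; inputs must be 2-D float arrays, not lists of tuples
T_arr = np.array([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
B_arr = np.array([(0.0, 0.0, 1.0), (1.0, 1.0, 3.0)])
dists = min_dists(T_arr, B_arr)
print(dists)
```

This is still a brute-force search, so the running time grows with len(T) * len(B).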
Spatial Digger
  • Can you please post the stack trace ? Also, what does the argument T and B contain ? – Gambit1614 Mar 09 '19 at 05:25
  • What sort of error? Something about not being able to implement `cdist`? By the way, how are you importing or defining `cdist`? It's not 'native' `numpy`. – hpaulj Mar 09 '19 at 07:03
  • ok, I've updated the question – Spatial Digger Mar 09 '19 at 08:54
  • cdist is brought in using `from scipy.spatial.distance import cdist` – Spatial Digger Mar 09 '19 at 08:55
  • Is `cdist` in the list of functions that can be handled with 'nopython'? – hpaulj Mar 09 '19 at 15:20
  • I'm assuming that's the issue, although I converted the nested list to a numpy array and saw a 10x speed improvement (from 30 minutes to 3 minutes), and it looks like it's invoking the GPU (my GPU activity light turns on). So the question is: is scipy supported within numba? Any idea where the numba list is? – Spatial Digger Mar 09 '19 at 18:54
  • Here's the list, cdist doesn't seem to be there, min() is though so the speed improvement might be related to that (although I doubt it)? https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html – Spatial Digger Mar 09 '19 at 18:57
  • `numba` seeks to implement a large part of `numpy`. `scipy` is a collection of add-ons to `numpy`, generally scientific and mathematical code that uses `numpy`, but it is not tightly integrated with `numpy` or with itself. So while `scipy` code uses `numpy` code, you should not assume that `numpy` itself is aware of `scipy` specifics, and the same applies to `numba`. – hpaulj Mar 10 '19 at 01:12
  • So I assume cdist is not compatible with numba? – Spatial Digger Mar 10 '19 at 08:50
  • How large are the lists you are working on? It looks like you are doing a nearest-neighbour search. In that case a kd-tree solution https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.cKDTree.html will outperform the best brute-force Numba solution on larger arrays. Apart from that, there are several ways to implement cdist in numba, e.g. https://stackoverflow.com/a/53380192/4045774 – max9111 Mar 11 '19 at 08:47
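
The kd-tree suggestion in the last comment could look roughly like the sketch below: build a scipy cKDTree on B once, then query it with every point of T. The function name near_dist_kdtree and the sample points are made up for illustration, and this has not been timed against the original data.

```python
import numpy as np
from scipy.spatial import cKDTree


def near_dist_kdtree(T, B):
    """Return (distances, indices) of the nearest point in B for each point in T."""
    T = np.asarray(T, dtype=np.float64)
    B = np.asarray(B, dtype=np.float64)
    # Build the tree once on the cloud being searched
    tree = cKDTree(B)
    # k=1 asks for the single nearest neighbour of each query point
    dist, idx = tree.query(T, k=1)
    return dist, idx


# Hypothetical sample data, not the real point clouds
T_pts = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
B_pts = [(0.0, 0.0, 1.0), (1.0, 1.0, 3.0)]
dist, idx = near_dist_kdtree(T_pts, B_pts)
print(dist, idx)
```

The returned indices identify which point of B was nearest, which the cdist(...).min() version does not report.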

0 Answers