With NumPy, I am using the algorithm below to find, for each element of one list, its index in a separate list:

    IndexList = np.zeros(len(x))

    for i in range(len(x)):

        positionx = np.where(cx == x[i])
        positiony = np.where(cy == y[i])

        Index = np.intersect1d(positionx, positiony)

        IndexList[i] = Index

This is already quite fast, but I would like to know whether there is a faster way to achieve the same result. Is there a better module for this than NumPy, or are there other NumPy functions that would make it faster? Can this kind of snippet be sped up with a more Pythonic approach or with comprehensions?

Ultimately, I want to check whether the matrix that contains cx and cy has (x, y) coordinate pairs that match the x, y pairs from the two lists.

Example: cx, cy, x, and y are 1D NumPy arrays.

    cx cy        x  y 
    45 30        20 10
    20 10        19 13
    44 53      
    19 13

In this case, IndexList = [1, 3].

bcsta
  • 3
    It is not entirely clear to me what you want to achieve. Are you checking whether the array contains a certain coordinate? – Willem Van Onsem Jul 10 '18 at 19:25
  • @WillemVanOnsem I am looking to see whether the matrix that contains cx and cy has (x, y) coordinate pairs that match the current x, y pairs from the two lists. Sorry if that was not clear. – bcsta Jul 10 '18 at 19:27
  • Could you post some sample data and the expected output, please? That would clarify the question better than your description. – Haleemur Ali Jul 10 '18 at 19:28
  • @AlexReynolds x and y are not the same length as cx and cy. sorting to what order? x and y are coordinate pairs. maybe I did not quite understand your suggestion. Maybe an actual answer will be better. – bcsta Jul 10 '18 at 19:34
  • Can you post your data please – Rushabh Mehta Jul 10 '18 at 19:35
  • @RushabhMehta I added a very simple but clear example in the question – bcsta Jul 10 '18 at 19:36
  • Would it always have exactly two columns? – Divakar Jul 10 '18 at 19:37
  • It's really not as clear as you think. What are the data structures you are using? What are the contents. If there are multiple locations of overlaps, what output do you want? This is quite unclear – Rushabh Mehta Jul 10 '18 at 19:37
  • @RushabhMehta These are all 1D NumPy arrays whose contents are coordinates. The data is such that there are no multiple locations of overlap; the intersect1d result will always be of length 1. I want a list of the indices where x, y match cx, cy. – bcsta Jul 10 '18 at 19:40

1 Answer

Check which of the pairs in cx & cy equal the pairs in x & y by broadcasting each comparison into a 2D boolean mask of shape (len(x), len(cx)):

    mask = (cx == x[:, None]) & (cy == y[:, None])

To get the indices of the elements in cx & cy that are present in x & y, use:

    np.vstack(np.where(mask))[1]
    # outputs: array([1, 3], dtype=int64)

To get the indices of the elements in x & y that are present in cx & cy, use:

    np.vstack(np.where(mask))[0]
    # outputs: array([0, 1], dtype=int64)
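As a minimal, self-contained sketch of the masking step, here it is run on the sample data from the question. The array literals are copied from the example table, and the names `rows` and `cols` are introduced here only for illustration; `np.nonzero` is used in place of `np.vstack(np.where(mask))` and returns the same two index arrays.

```python
import numpy as np

# Sample data copied from the example table in the question.
cx = np.array([45, 20, 44, 19])
cy = np.array([30, 10, 53, 13])
x = np.array([20, 19])
y = np.array([10, 13])

# x[:, None] has shape (2, 1); comparing it with cx of shape (4,)
# broadcasts to a (2, 4) boolean array: one row per (x, y) pair,
# one column per (cx, cy) pair.
mask = (cx == x[:, None]) & (cy == y[:, None])

# rows index into x/y, cols index into cx/cy (hypothetical names).
rows, cols = np.nonzero(mask)

print(mask.shape)  # (2, 4)
print(rows)        # [0 1]
print(cols)        # [1 3]
```

Because the nonzero entries are reported in row-major order, `cols` follows the order of the pairs in x & y, which matches the loop in the question when every pair has exactly one match.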

Benchmark results (the test functions are defined below):

    %timeit op(cx, cy, x, y)
    44.6 µs ± 616 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

    %timeit hal(cx, cy, x, y)
    8.57 µs ± 90.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

This is a speedup of about 5.2x on the sample data.

    # test methods
    def op(cx, cy, x, y):
        IndexList = np.zeros(len(x))
        for i in range(len(x)):
            px = np.where(cx == x[i])
            py = np.where(cy == y[i])
            Index = np.intersect1d(px, py)
            IndexList[i] = Index
        return IndexList

    def hal(cx, cy, x, y):
        mask = (cx == x[:, None]) & (cy == y[:, None])
        return np.vstack(np.where(mask))[1]
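
One thing to keep in mind is that the broadcast mask holds len(x) * len(cx) booleans, so its memory use grows with the product of the two lengths. The following is a rough sketch of a different technique, a plain dictionary lookup, that avoids building the mask. It was not part of the benchmark above, the function name `lookup_pairs` is hypothetical, and it assumes that every (cx, cy) pair is unique and that every (x, y) pair occurs in (cx, cy).

```python
import numpy as np

def lookup_pairs(cx, cy, x, y):
    """Return the index in (cx, cy) of each (x, y) pair (hypothetical helper)."""
    # Map each (cx, cy) pair to its position. Assumes the pairs are unique;
    # if a pair repeats, the last position wins.
    position = {pair: i for i, pair in enumerate(zip(cx.tolist(), cy.tolist()))}
    # Raises KeyError if an (x, y) pair does not occur in (cx, cy).
    return np.array([position[pair] for pair in zip(x.tolist(), y.tolist())])

# Sample data copied from the example table in the question.
cx = np.array([45, 20, 44, 19])
cy = np.array([30, 10, 53, 13])
x = np.array([20, 19])
y = np.array([10, 13])

result = lookup_pairs(cx, cy, x, y)
print(result)  # [1 3]
```

Whether this is faster than the mask depends on the array sizes, so it would need to be timed on the real data.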
Haleemur Ali