
Given two 2D numpy arrays containing x and y coordinates, how can I find identical pairs in another array with identical dimensions?

For example, I have these arrays:

array([[ 2,  1,  3,  4],
       [ 4,  3,  5, 10]])

and

array([[ 0,  2,  3,  4],
       [ 3,  4, 11, 10]])

I would expect to find that the pairs (2, 4) and (4, 10) would be detected as existing in both arrays.

Thanks very much in advance!

Louis Thibault

3 Answers

7

Try this:

>>> a2 = [[ 0,  2,  3,  4],
...       [ 3,  4, 11, 10]]
>>> a1 = [[ 2,  1,  3,  4],
...       [ 4,  3,  5, 10]]
>>> set(zip(*a1)) & set(zip(*a2))
{(4, 10), (2, 4)}

You can convert a numpy array to a list with array.tolist().

In each 2D array, the first row holds the x coordinates and the second row holds the y coordinates, so zip(*a1) yields all the (x, y) coordinate pairs. The set() constructor then discards any duplicate pairs. Finally, the & operator between the two sets returns the coordinate pairs that occur in both arrays.
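The steps described above can be sketched for numpy input as follows. This is only an illustration: the helper name common_pairs is made up for this example, and it assumes both arrays have the 2 x N layout from the question (x values in the first row, y values in the second).

```python
import numpy as np

a1 = np.array([[2, 1, 3, 4], [4, 3, 5, 10]])
a2 = np.array([[0, 2, 3, 4], [3, 4, 11, 10]])

def common_pairs(first, second):
    """Return the set of (x, y) pairs present in both 2 x N arrays.

    common_pairs is a hypothetical helper name, not a numpy function.
    """
    # tolist() gives nested lists of plain Python ints;
    # zip(*rows) pairs the i-th x value with the i-th y value.
    pairs_first = set(zip(*first.tolist()))
    pairs_second = set(zip(*second.tolist()))
    # Set intersection keeps only the pairs found in both arrays.
    return pairs_first & pairs_second

shared = common_pairs(a1, a2)
print(sorted(shared))  # sorted because sets have no defined order
```

Because the result is a set, the original column order and any repeated pairs are lost; that is fine for a membership question like this one.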

Hope it helps!

Sheng
  • @heltonbiker I do not think so. Could you please give me a counterexample? I have added some explanation to the answer. – Sheng Apr 19 '13 at 00:53
  • Actually it would be interesting if the OP had something to say about it... – heltonbiker Apr 19 '13 at 01:56
  • @heltonbiker Sorry. Who is OP? And there are two: (2, 4) and (4, 10). I tried my solution on your test case. It works. – Sheng Apr 19 '13 at 02:16
  • OP means "original poster"; in this case, blz. – Kyle Strand Apr 19 '13 at 02:17
  • Also, Sheng, this solution is great. – Kyle Strand Apr 19 '13 at 02:17
  • @KyleStrand Thank you! I am a newbie here, so some of the jargon is unfamiliar to me. – Sheng Apr 19 '13 at 02:19
  • I think OP might have originally been a reddit thing, actually. Not sure though. – Kyle Strand Apr 19 '13 at 02:22
  • That indeed worked with different sample inputs. I'm removing previous comments and studying your solution! – heltonbiker Apr 19 '13 at 02:24
  • @Sheng also, in this case you can apply zip to the `ndarrays` directly, because when you iterate over an array, or in this case unpack (`*ndarray`), the array is iterated along first dimension. By the way, a solid alternative to `numpy.split`! – heltonbiker Apr 19 '13 at 02:29
  • @heltonbiker I am not quite familiar with numpy. Could you please tell me how to use numpy.split here? In the official documents, this function is said to **split an array into multiple sub-arrays**, not extract pairs. – Sheng Apr 19 '13 at 02:59
  • In our context here, unpacking an array, or even doing `for stuff in array`, treats the array as a list of lists. With a 2D array I can use `for row in array`, because iteration runs over the first dimension and returns the rows. That works because arrays CAN be indexed (they implement the `__getitem__` method that is called by the index `[]` operator), but (multidimensional) arrays are intended to be SLICED. If I wanted "for column in array", I could not do it like this (columns are not the first dimension), and instead I would use `for column in numpy.split(array, array.shape[1], axis=1)`; the `axis=1` is needed because `numpy.split` splits along the first dimension by default. – heltonbiker Apr 19 '13 at 04:52
  • @heltonbiker I really appreciate your instructions. Thanks! I learned something. – Sheng Apr 19 '13 at 05:02
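The point about row and column iteration from the comments can be sketched as below. The variable names are chosen only for this illustration. Note that numpy.split needs axis=1 to cut along columns, and that iterating over the transpose is a simpler way to get the same columns.

```python
import numpy as np

arr = np.array([[2, 1, 3, 4], [4, 3, 5, 10]])

# Iterating over a 2D array walks the first dimension, so this yields rows.
rows = [row.tolist() for row in arr]

# Iterating over the transpose yields the columns, i.e. the (x, y) pairs.
columns_via_transpose = [col.tolist() for col in arr.T]

# numpy.split with axis=1 cuts the array into arr.shape[1] pieces of
# shape (2, 1); ravel() flattens each piece to a length-2 vector.
columns_via_split = [
    piece.ravel().tolist()
    for piece in np.split(arr, arr.shape[1], axis=1)
]

print(rows)
print(columns_via_transpose)
print(columns_via_split)
```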
3

The numpythonic way of doing this would be as follows:

>>> import numpy as np
>>> a1 = np.array([[2, 1, 3, 4], [4, 3, 5, 10]])
>>> a2 = np.array([[0, 2, 3, 4], [3, 4, 11, 10]])
>>> a1 = a1.T.copy().view([('', a1.dtype)]*2)
>>> a2 = a2.T.copy().view([('', a2.dtype)]*2)
>>> np.intersect1d(a1, a2)
array([(2, 4), (4, 10)], 
      dtype=[('f0', '<i4'), ('f1', '<i4')])
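For readers who want the result back as an ordinary 2 x N array, here is a minimal sketch of the same structured-view idea. The helper name as_records and the field names 'x' and 'y' are assumptions made for this example; it also assumes both arrays share the same dtype.

```python
import numpy as np

a1 = np.array([[2, 1, 3, 4], [4, 3, 5, 10]])
a2 = np.array([[0, 2, 3, 4], [3, 4, 11, 10]])

def as_records(a):
    """View a 2 x N array as a 1D array of (x, y) records.

    as_records and the field names are hypothetical, chosen for clarity.
    """
    # Transpose so each row is one pair, and force contiguous memory so
    # the two values of a pair sit next to each other.
    pairs = np.ascontiguousarray(a.T)
    # Reinterpret each row of two values as a single two-field record.
    return pairs.view([('x', a.dtype), ('y', a.dtype)]).ravel()

# intersect1d compares whole records, so only identical pairs survive.
common = np.intersect1d(as_records(a1), as_records(a2))

# Rebuild a plain 2 x N array: first row x values, second row y values.
result = np.stack([common['x'], common['y']])
print(result)
```

The pairs come back sorted by x and then y, since intersect1d sorts its output.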
Jaime
0

A direct solution would be:

import numpy

array1 = numpy.array([[ 1, 99, 2, 400],
                      [ 3, 98, 4, 401]])

array2 = numpy.array([[ 1,  6, 99,   7],
                      [ 8,  9, 98, 401]])

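# Compare every column of array1 with every column of array2
# and collect the columns that are identical.
result = []
for column_1 in range(array1.shape[1]):
    for column_2 in range(array2.shape[1]):
        if numpy.array_equal(array1[:, column_1], array2[:, column_2]):
            result.append(array1[:, column_1])

# Transpose so the output has the same 2 x N layout as the inputs.
print(numpy.array(result).transpose())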

[[99]
 [98]]
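The same all-pairs comparison can also be written with broadcasting instead of explicit loops. This is a sketch rather than a drop-in replacement: it builds an intermediate boolean array of shape (2, n1, n2), so it suits small inputs, and unlike the loop it keeps each matching column of array1 only once.

```python
import numpy as np

array1 = np.array([[1, 99, 2, 400],
                   [3, 98, 4, 401]])
array2 = np.array([[1, 6, 99, 7],
                   [8, 9, 98, 401]])

# array1[:, :, None] has shape (2, n1, 1) and array2[:, None, :] has
# shape (2, 1, n2); comparing them broadcasts to shape (2, n1, n2).
equal = array1[:, :, None] == array2[:, None, :]

# Two columns match only if both coordinates are equal.
matches = equal.all(axis=0)  # shape (n1, n2)

# Keep the columns of array1 that match at least one column of array2.
result = array1[:, matches.any(axis=1)]
print(result)
```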
heltonbiker