Before posting this question, I searched this website but could not find a solution.

Suppose that I have two 2D NumPy arrays, a and b, such that:

a = np.array([[0. , 1.],
              [0.5, 1.6],
              [1.2, 3.2],
              [4. , 3. ],
              [4.7, 3.5],
              [5.9, 2.4],
              [7.6, 8.8],
              [9.5, 6.2]
              ])

b = np.array([[7.6, 8.8],
              [4. , 3. ],
              [9.5, 6.2],
              [1.2, 3.2]
              ])

I want to find the indices of the rows of b within a. That is, for each row of b, return its position in a. In this case, the expected result is something like:

args = np.array([6, 3, 7, 2])

I've tried something like:

args = np.argwhere(a == b)  # But the result is an empty array

Any help is appreciated.

1 Answer

Try this:

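# Compare every row of a with every row of b; result has shape (len(a), len(b))
temp_search = (a[:, None] == b).all(-1)
# Index in a of the first match for each row of b, or NaN if there is none
args = np.where(temp_search.any(0), temp_search.argmax(0), np.nan)
# Drop the NaN placeholders and convert back to integers
args = args[~np.isnan(args)].astype(int)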

Outputs:

[6 3 7 2]
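
For anyone wondering how those lines work, here is a step-by-step sketch using the arrays from the question. The intermediate names (elementwise, row_match, found, first_index) are only for illustration and are not part of the code above. The last step uses boolean indexing instead of the NaN filter, which should give the same indices while keeping an integer dtype throughout.

```python
import numpy as np

a = np.array([[0.0, 1.0], [0.5, 1.6], [1.2, 3.2], [4.0, 3.0],
              [4.7, 3.5], [5.9, 2.4], [7.6, 8.8], [9.5, 6.2]])
b = np.array([[7.6, 8.8], [4.0, 3.0], [9.5, 6.2], [1.2, 3.2]])

# a[:, None] has shape (8, 1, 2); comparing it with b (4, 2) broadcasts to (8, 4, 2)
elementwise = a[:, None] == b

# A row matches only if every column matches: collapse the last axis, shape (8, 4)
row_match = elementwise.all(-1)

# For each row of b (one column of row_match): is there any matching row in a?
found = row_match.any(0)

# argmax returns the index of the first True per column (and 0 if there is none)
first_index = row_match.argmax(0)

# Keep only the indices that correspond to a real match
args = first_index[found]
print(args)  # [6 3 7 2]
```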

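The issue is with a == b. The two arrays have shapes (8, 2) and (4, 2), which cannot be broadcast against each other, so the comparison is not performed elementwise. In older NumPy versions it returns a single boolean value (False) together with a deprecation warning, which is why np.argwhere gives an empty array; newer releases raise an error instead.

Elementwise == between arrays of compatible shapes is not deprecated. What has been deprecated for a while is the scalar fallback for comparisons that cannot be performed elementwise: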

https://github.com/numpy/numpy/issues/6784

Reference: Check common elements of two 2D numpy arrays, either row or column wise
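
If you want to check whether two shapes can be compared directly with ==, a quick sketch like the one below may help. It assumes NumPy 1.20 or later (for np.broadcast_shapes); the variable names direct_ok and pairwise_shape are made up for the example.

```python
import numpy as np

a = np.zeros((8, 2))  # same shapes as in the question; the values do not matter here
b = np.zeros((4, 2))

# (8, 2) vs (4, 2): the leading dimensions 8 and 4 differ and neither is 1
try:
    np.broadcast_shapes(a.shape, b.shape)
    direct_ok = True
except ValueError:
    direct_ok = False

# (8, 1, 2) vs (4, 2): the inserted axis of length 1 stretches to 4
pairwise_shape = np.broadcast_shapes(a[:, None].shape, b.shape)

print(direct_ok)       # False
print(pairwise_shape)  # (8, 4, 2)
```

This is why the answer inserts a new axis with a[:, None] before comparing: it turns one failed elementwise comparison into a comparison of every row of a against every row of b.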


This is an enhanced version that handles duplicate rows in the arrays (see this answer for reference).

import numpy as np
import pandas as pd

def assign_duplbl(a):
    # Number repeated rows 1, 2, 3, ... in order of appearance
    df = pd.DataFrame(a)
    df['num'] = 1
    return df.groupby(list(range(a.shape[1]))).cumsum().values

def argwhere2d(arr, target_arr):
    # Return the location of each row of arr in target_arr

    # Append the occurrence number as an extra column so that the k-th copy
    # of a row in arr only matches the k-th copy of that row in target_arr
    a = np.hstack((arr, assign_duplbl(arr)))
    b = np.hstack((target_arr, assign_duplbl(target_arr)))
    temp_search = (b[:, None] == a).all(-1)
    args = np.where(temp_search.any(0), temp_search.argmax(0), np.nan)
    args = args[~np.isnan(args)].astype(int)
    return args
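
To see what the extra occurrence column changes, here is a small pandas-free sketch of the same idea. occurrence_labels is a hypothetical stand-in for assign_duplbl (it counts repeats with a plain dict rather than groupby/cumsum), and the tiny target and query arrays are made up for the illustration.

```python
import numpy as np

def occurrence_labels(rows):
    # Hypothetical helper: 1 the first time a row appears, 2 the second time, ...
    counts = {}
    labels = []
    for row in map(tuple, rows):
        counts[row] = counts.get(row, 0) + 1
        labels.append(counts[row])
    return np.array(labels).reshape(-1, 1)

target = np.array([[1, 2], [3, 4], [1, 2]])  # [1, 2] sits at positions 0 and 2
query = np.array([[1, 2], [1, 2], [3, 4]])   # ask for [1, 2] twice

# Without labels, every copy of [1, 2] resolves to its first position
plain = (target[:, None] == query).all(-1).argmax(0)

# With labels, the second copy in query matches the second copy in target
t = np.hstack((target, occurrence_labels(target)))
q = np.hstack((query, occurrence_labels(query)))
labelled = (t[:, None] == q).all(-1).argmax(0)

print(plain)     # [0 0 1]
print(labelled)  # [0 2 1]
```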
Rithin Chalumuri
  • I got `AxisError: axis 1 is out of bounds for array of dimension 0` –  Nov 05 '19 at 00:43
  • @IamNotaMathematician, are both of your arrays a and b 2D, as you mentioned in the question? – Rithin Chalumuri Nov 05 '19 at 00:49
  • Yes, of course they are both 2D: `a.shape` is `(8, 2)` and `b.shape` is `(4, 2)`. –  Nov 05 '19 at 00:49
  • I got an empty array. Please try to execute your instructions –  Nov 05 '19 at 00:52
  • @IamNotaMathematician, please check the updated answer. – Rithin Chalumuri Nov 05 '19 at 00:55
  • Again, an empty array. Could you please run your answer? –  Nov 05 '19 at 00:56
  • @IamNotaMathematician, the issue seems to be in `a==b`. Instead of returning an np array, it's returning just a boolean value, i.e. `False`. It seems that elementwise comparison between NumPy arrays with `==` is deprecated. We need to find an alternative for this. – Rithin Chalumuri Nov 05 '19 at 01:15
  • I've solved it. –  Nov 05 '19 at 01:41
  • @IamNotaMathematician, I've updated the answer to a solution, that should work now :) – Rithin Chalumuri Nov 05 '19 at 01:42
  • Thanks for the update. Actually, your answer is about 16x faster than mine. Could you just explain your code? –  Nov 05 '19 at 01:45
  • There is something wrong with this. If an element in `b` doesn't exist in `a` then it returns `0` for argument. –  Nov 05 '19 at 14:03
  • @IamNotaMathematician, updated the answer so it takes care of missing elements scenario. – Rithin Chalumuri Nov 05 '19 at 14:07
  • This returns a float dtype array; it should be integers. Also, the nan values should be removed from the result with something like this: https://stackoverflow.com/questions/11620914/removing-nan-values-from-an-array –  Nov 05 '19 at 14:13
  • @IamNotaMathematician, you can do `args.astype(int)` to convert floats to int. And you can then remove nan values from the result like in the link you've shown. – Rithin Chalumuri Nov 05 '19 at 14:17
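
A possible alternative to the NaN-then-filter approach discussed in these comments: if you would rather keep one entry per row of b, an integer sentinel avoids the float dtype entirely. This is only a sketch; locate_rows and its missing parameter are hypothetical names, and -1 is an arbitrary choice of sentinel.

```python
import numpy as np

def locate_rows(haystack, needles, missing=-1):
    # Hypothetical helper: index in haystack of each row of needles, or `missing`
    match = (haystack[:, None] == needles).all(-1)  # shape (len(haystack), len(needles))
    # Both branches are integers, so the result keeps an integer dtype
    return np.where(match.any(0), match.argmax(0), missing)

a = np.array([[0.0, 1.0], [0.5, 1.6], [1.2, 3.2], [4.0, 3.0],
              [4.7, 3.5], [5.9, 2.4], [7.6, 8.8], [9.5, 6.2]])
# Third row is deliberately absent from a
b = np.array([[7.6, 8.8], [4.0, 3.0], [100.0, 100.0], [1.2, 3.2]])

result = locate_rows(a, b)
print(result.tolist())  # [6, 3, -1, 2]
```

The result stays aligned with b, so result[i] always refers to b[i]; filtering with result[result >= 0] gives the same output as dropping the NaN values.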