I want to remove all rows that overlap, but I cannot find a way to do it.
If
a= ([1,2],[3,4],[5,6],[7,8])
b= ([3,4],[7,8])
k = np.delete (a,b,0)
I want something like that to work
I am expecting k = ([3,4],[7,8])
You can do this using:
result = [row for row in a if row in b]
print(result)
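Note that with plain Python lists, `row in b` compares whole rows, so the comprehension above keeps the rows of a that also appear in b. Here is a minimal sketch of both variants, assuming a and b are plain nested lists rather than NumPy arrays (the names kept and removed are only illustrative):

```python
a = [[1, 2], [3, 4], [5, 6], [7, 8]]
b = [[3, 4], [7, 8]]

# Rows of a that also appear in b (whole-row comparison on lists).
kept = [row for row in a if row in b]

# Rows of a that do not appear in b, i.e. the overlap removed.
removed = [row for row in a if row not in b]

print(kept)     # [[3, 4], [7, 8]]
print(removed)  # [[1, 2], [5, 6]]
```

If a and b are NumPy arrays instead, `row in b` no longer compares whole rows, so it is probably safer to convert with .tolist() first or use a vectorized approach.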
This takes two steps.
First, find the indices of the rows in a that overlap with b.
This can be done as follows:
overlap_indices = np.where(np.isin(a, b).all(axis=1))[0]
[1] np.isin(a, b): returns a boolean array with the same shape as a, where each element is True if the corresponding element of a is present anywhere in b, and False otherwise. In your case:
[[False False]
 [ True  True]
 [False False]
 [ True  True]]
[2] .all(axis=1): returns a boolean array of length a.shape[0] that marks the rows whose elements are all True. In your case:
[False True False True]
[3] np.where(): returns the indices of the True values in the boolean array produced by .all(axis=1). The result is a tuple containing a single array:
(array([1, 3]),)
and the [0] selects the one and only element of that tuple:
[1 3]
Then you can call np.delete:
k = np.delete(a, overlap_indices, axis=0)
This returns a new array, k, with the rows at the specified indices removed; a itself is left unchanged.
print(k) # [[1 2] [5 6]]
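Putting the two steps together, a self-contained sketch might look like this. It assumes a and b are converted to 2-D NumPy arrays first; row_mask is just an illustrative name for the intermediate result.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
b = np.array([[3, 4], [7, 8]])

# Step 1: element-wise membership test, then require every element
# of a row to be found in b.
row_mask = np.isin(a, b).all(axis=1)
overlap_indices = np.where(row_mask)[0]

# Step 2: drop those rows. np.delete returns a new array and does not modify a.
k = np.delete(a, overlap_indices, axis=0)

print(overlap_indices.tolist())  # [1, 3]
print(k.tolist())                # [[1, 2], [5, 6]]
```

Boolean indexing with a[~row_mask] should give the same rows without needing np.where or np.delete.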
Sure, this could be done with for loops, but this is a vectorized implementation that is typically much faster when the NumPy array is large.
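One caveat: np.isin tests each element on its own, so a row such as [3, 8] would also be flagged even though it is not a row of b. Below is a sketch of a stricter, whole-row comparison using broadcasting, assuming a and b are 2-D arrays with the same number of columns (pairwise_equal and row_in_b are illustrative names, and the extra row [3, 8] is added only to show the difference):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [3, 8]])
b = np.array([[3, 4], [7, 8]])

# Compare every row of a with every row of b.
# Resulting shape: (len(a), len(b), number of columns).
pairwise_equal = a[:, None, :] == b[None, :, :]

# A row of a overlaps if it equals some row of b in every column.
row_in_b = pairwise_equal.all(axis=2).any(axis=1)

# Keep only the rows of a that are not rows of b.
k = a[~row_in_b]
print(k.tolist())  # [[1, 2], [5, 6], [3, 8]]
```

The intermediate array holds len(a) * len(b) * columns booleans, so for very large inputs this trades memory for simplicity.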