Delete all numbers from an array which are in another array

Question

I have an array "removable" containing a few numbers from another array, "All", which contains all numbers from 0 to k.

I want to remove all numbers in All which are listed in removable.

All = np.arange(k)
removable = np.array([1, 3, 4, 7, 9, ..., 200])

for i in removable:
    if i in All:
        All.remove(i)

ndarray has no remove attribute. I'm sure there is an easy way to do this in NumPy, but I can't find it in the documentation.

I get removable from another method and, sadly, I'm not able to change it. — Tim4497, Feb 05 '19 at 14:51

score 5 · Accepted Answer · edited Feb 05 '19 at 14:58

You could use the function setdiff1d from NumPy:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> np.setdiff1d(a, b)
array([1, 2])
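
Applied to the arrays from the question, this might look like the sketch below. The value of k and the contents of removable are made up for illustration, since the question does not give the real ones.

```python
import numpy as np

k = 12  # hypothetical size; the question does not state k
All = np.arange(k)                      # 0, 1, ..., k - 1
removable = np.array([1, 3, 4, 7, 9])   # hypothetical values to drop

# setdiff1d returns the sorted, unique values of All that are not in removable
remaining = np.setdiff1d(All, removable)
print(remaining.tolist())  # [0, 2, 5, 6, 8, 10, 11]
```

Note that All itself is not modified; the result is a new array bound to remaining.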

edited Feb 05 '19 at 14:58

Brad Solomon

answered Feb 05 '19 at 14:52

f.wue

Note that this will de-duplicate the original entries in `a` (not done by the pseudocode in the question), and the result will be sorted – Brad Solomon Feb 05 '19 at 14:55
That's true. However, np.arange(k) produces an array without duplicates. My answer will not work with duplicates. – f.wue Feb 05 '19 at 14:57
Oh snap, `setdiff1d` is even faster than explicit set conversion and differencing. I guess that makes sense, probably more optimized. I didn't know numpy had this! – Engineero Feb 05 '19 at 15:07
Now the question is whether the OP wants to keep duplicates or drop them – Martin Feb 05 '19 at 15:16

Brad Solomon · Answer 2 · 2019-02-05T15:21:40.323

np.setdiff1d() will de-duplicate the original entries, and will also return the result sorted.

That's fine in some cases, but if you want to avoid one or both of these aspects, have a look at np.in1d() with an (inverted) boolean mask:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> a[~np.in1d(a, b)]
array([1, 2, 2, 1])

The ~ operator does inversion on the boolean mask:

>>> np.in1d(a, b)
array([False, False,  True, False,  True, False])

>>> ~np.in1d(a, b)
array([ True,  True, False,  True, False,  True])

Disclaimer:

Note that this is not truly removal, as you indicated in your question; boolean-mask indexing returns a new array holding the filtered elements, and the original array a is left unchanged. The same goes for np.delete(), which also returns a new array; there's no concept of in-place element deletion for NumPy arrays.
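
The following sketch checks what happens to the original array after masking. It uses np.isin() in place of np.in1d(), and the name filtered is only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])

# Boolean-mask indexing builds a new array from the selected elements
filtered = a[~np.isin(a, b)]

print(filtered.tolist())              # [1, 2, 2, 1]
print(a.tolist())                     # [1, 2, 3, 2, 4, 1], a is untouched
print(np.shares_memory(a, filtered))  # False
```

To get the effect of removal in practice, rebind the name to the filtered result, for example a = a[~np.isin(a, b)].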

Martin · Answer 3 · 2019-02-05T15:09:14.753

1

Solution: fast for big arrays, with no need to convert to a list (which would slow down the computation).

orig = np.arange(15)
to_remove = np.array([1, 2, 3, 4])
mask = np.isin(orig, to_remove)
orig = orig[np.invert(mask)]

>>> orig
array([ 0,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
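
As a possible variation, np.isin() accepts an invert argument, so the separate np.invert() call can be folded into the membership test. This is a sketch of the same idea with the same example data; the name keep is only illustrative.

```python
import numpy as np

orig = np.arange(15)
to_remove = np.array([1, 2, 3, 4])

# invert=True marks the elements that are NOT in to_remove
keep = np.isin(orig, to_remove, invert=True)
orig = orig[keep]

print(orig.tolist())  # [0, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
```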

edited Feb 05 '19 at 15:09

answered Feb 05 '19 at 15:01

Martin

`np.isin()` calls `np.asarray()` + `np.in1d()` and does reshaping. If both of the inputs are 1d, those checks are probably not needed – Brad Solomon Feb 05 '19 at 15:03
There's no need to say it's impossible or to give a different solution from what the OP asked for – Martin Feb 05 '19 at 15:09
Constructive criticism is always welcome. We are all fellow programmers here, some having learnt a lot already, others just starting. Help correct mistakes or don't, but there is absolutely no need to mock them. – Paritosh Singh Feb 05 '19 at 15:11

score -1 · Answer 4 · answered Feb 05 '19 at 14:55

NumPy arrays have a fixed size; you cannot remove elements from them in place.

You cannot do this with ndarrays.

Mihai Andrei

Upvote to counter the downvotes. Perhaps pedantic, but not incorrect. And not an unimportant aspect about ndarrays to appreciate, on the path to a good solution to this problem. – Eelco Hoogendoorn Feb 05 '19 at 15:10

score -1 · Answer 5 · edited Feb 08 '19 at 12:59

You should do this with sets instead of lists/arrays, which is easy enough:

remaining = np.array(list(set(arr).difference(removable)))

where arr is your All array above (a lowercase name such as "all" would shadow the built-in function all() and should be avoided).

Granted, using sets will get rid of repeated elements if you have those in your arr, but it sounds like arr is just a sequence of unique values. Sets have much more efficient membership checking (constant-time vs. order N), so you get to go a lot faster. By comparison, I made a list version that builds a list if a value is not in removable:

def remove_list(arr, rem):
    result = []
    for i in arr:
        if i not in rem:
            result.append(i)
    return result

and made my set version a function as well:

def remove_set(arr, rem):
    return np.array(list(set(arr).difference(rem)))

Timing comparison with arr = np.arange(10000) and removable = np.random.randint(0, 10000, 1000):

remove_list(arr, removable)
# 55.5 ms ± 664 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

remove_set(arr, removable)
# 947 µs ± 3.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In this test, the set version is more than 50 times faster.
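
A minimal sketch of the set-based approach that returns a regular 1-D array is shown below. The helper name remove_set_sorted and the sample data are hypothetical. The set is passed through sorted() before np.array(), because calling np.array() directly on a set produces a 0-d object array wrapping the set rather than an array of its elements.

```python
import numpy as np

def remove_set_sorted(arr, rem):
    # Hypothetical helper: convert both inputs to plain Python ints,
    # take the set difference, then sort for a deterministic order
    kept = set(arr.tolist()).difference(rem.tolist())
    return np.array(sorted(kept))

arr = np.arange(10)
removable = np.array([1, 3, 3, 7])

result = remove_set_sorted(arr, removable)
print(result.tolist())  # [0, 2, 4, 5, 6, 8, 9]

# For this input the result agrees with np.setdiff1d
print(np.array_equal(result, np.setdiff1d(arr, removable)))  # True
```

Like np.setdiff1d(), this drops duplicates and does not preserve the original order of arr.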
