2

I have an array "removable" containing a few numbers from another array "All", which contains all integers from 0 to k - 1.

I want to remove all numbers in All that are listed in removable.

All = np.arange(k)
removable = np.array([1, 3, 4, 7, 9, ..., 200])

for i in removable:
    if i in All:
        All.remove(i)

ndarray has no remove attribute. I'm sure there is an easy method in NumPy to solve this problem, but I can't find it in the documentation.

Jason Aller
Tim4497

5 Answers

5

You could use the function setdiff1d from NumPy:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> np.setdiff1d(a, b)
array([1, 2])
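Applied to the setup in the question, a minimal sketch might look like this. The value of k and the contents of removable are made up for illustration, since the original list is elided in the question.

```python
import numpy as np

k = 10  # assumed small value, for illustration only
All = np.arange(k)
removable = np.array([1, 3, 4, 7, 9])  # hypothetical values

# Returns the sorted, unique values of All that are not in removable
remaining = np.setdiff1d(All, removable)
print(remaining)  # [0 2 5 6 8]
```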
Brad Solomon
f.wue
  • Note that this will de-duplicate the original entries in `a` (not done by the pseudocode in the question), and the result will be sorted – Brad Solomon Feb 05 '19 at 14:55
  • That's true, however np.arange(k) provides an array without duplicates. My answer will not work with duplicates. – f.wue Feb 05 '19 at 14:57
  • Oh snap, `setdiff1d` is even faster than explicit set conversion and differencing. I guess that makes sense, probably more optimized. I didn't know numpy had this! – Engineero Feb 05 '19 at 15:07
  • Now the question is whether the OP wants to keep duplicates or drop them – Martin Feb 05 '19 at 15:16
2

np.setdiff1d() will de-duplicate the original entries, and will also return the result sorted.

That's fine in some cases, but if you want to avoid one or both of these aspects, have a look at np.in1d() with an (inverted) boolean mask:

>>> a = np.array([1, 2, 3, 2, 4, 1])
>>> b = np.array([3, 4, 5, 6])
>>> a[~np.in1d(a, b)]
array([1, 2, 2, 1])

The ~ operator does inversion on the boolean mask:

>>> np.in1d(a, b)
array([False, False,  True, False,  True, False])

>>> ~np.in1d(a, b)
array([ True,  True, False,  True, False,  True])
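The same filter can also be written with np.isin, which the NumPy documentation recommends over np.in1d for new code. It accepts an invert argument, so the ~ is not needed. A minimal sketch, reusing the arrays from above:

```python
import numpy as np

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])

# invert=True returns the negated membership mask directly
keep = np.isin(a, b, invert=True)
filtered = a[keep]
print(filtered)  # [1 2 2 1]
```

Order and duplicates in a are preserved, as with the in1d version.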

Disclaimer:

Note that this is not truly removal, as you indicated in your question; the result is a new array containing the filtered elements of the original array a, which itself is left unchanged. The same goes for np.delete(); there's no concept of in-place element deletion for NumPy arrays.
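Boolean-mask indexing produces a new array rather than modifying a in place, which can be checked with np.shares_memory. A minimal sketch, reusing a and b from above:

```python
import numpy as np

a = np.array([1, 2, 3, 2, 4, 1])
b = np.array([3, 4, 5, 6])

filtered = a[~np.isin(a, b)]

# Check whether the filtered array shares a buffer with the original
print(np.shares_memory(a, filtered))  # False

# Modifying the result leaves the original untouched
filtered[0] = 99
print(a)  # [1 2 3 2 4 1]
```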

Brad Solomon
1

A solution that is fast for large arrays and avoids converting to a list (which would slow the computation down):

import numpy as np

orig = np.arange(15)
to_remove = np.array([1, 2, 3, 4])
mask = np.isin(orig, to_remove)  # True where an element should be removed
orig = orig[~mask]               # keep everything else

>>> orig
array([ 0,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
Martin
  • `np.isin()` calls `np.asarray()` + `np.in1d()` and does reshaping. If both of the inputs are 1d, those checks are probably not needed – Brad Solomon Feb 05 '19 at 15:03
  • No need to say it's impossible or to give a different solution from what the OP is asking for – Martin Feb 05 '19 at 15:09
  • Constructive criticism is always welcome. We are all fellow programmers here, some having learnt a lot already, others just starting. Help correct mistakes or don't, but there is absolutely no need to mock them. – Paritosh Singh Feb 05 '19 at 15:11
-1

NumPy arrays have a fixed size, so you cannot remove elements from them in place.

You cannot do this with ndarrays.
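In practice, functions such as np.delete work around the fixed size by building a new, smaller array and leaving the original alone. A minimal sketch with made-up values:

```python
import numpy as np

arr = np.arange(6)

# np.delete takes index positions (not values) and returns a new array
shorter = np.delete(arr, [1, 3])

print(shorter)  # [0 2 4 5]
print(arr)      # [0 1 2 3 4 5]  (original is unchanged)

# To "remove" elements, rebind the name to the new array
arr = shorter
```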

Mihai Andrei
  • Upvote to counter the downvotes. Perhaps pedantic, but not incorrect. And not an unimportant aspect about ndarrays to appreciate, on the path to a good solution to this problem. – Eelco Hoogendoorn Feb 05 '19 at 15:10
-1

You should do this with sets instead of lists/arrays, which is easy enough:

remaining = np.array(list(set(arr).difference(removable)))

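where arr is your All array above (the lowercase name "all" is a Python built-in function and should not be shadowed). The set is converted to a list first because np.array() applied directly to a set produces a 0-dimensional object array rather than an array of numbers.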

Granted, using sets will get rid of repeated elements if you have those in your arr, but it sounds like arr is just a sequence of unique values. Sets have much more efficient membership checking (constant-time vs. order N), so you get to go a lot faster. By comparison, I made a list version that builds a list if a value is not in removable:

def remove_list(arr, rem):
    result = []
    for i in arr:
        if i not in rem:
            result.append(i)
    return result

and made my set version a function as well:

def remove_set(arr, rem):
    return np.array(list(set(arr).difference(rem)))

Timing comparison with arr = np.arange(10000) and removable = np.random.randint(0, 10000, 1000):

remove_list(arr, removable)
# 55.5 ms ± 664 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

remove_set(arr, removable)
# 947 µs ± 3.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In this benchmark, the set version is more than 50 times faster.
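Much of that gap comes from the membership test rather than from the loop itself. As a hedged sketch (remove_list_fast is a hypothetical name, not part of the answer above, and the input values are made up), converting rem to a set once gives the list version constant-time lookups too:

```python
import numpy as np

def remove_list_fast(arr, rem):
    # Build the lookup set once; .tolist() yields plain Python ints
    rem_set = set(np.asarray(rem).tolist())
    return [i for i in np.asarray(arr).tolist() if i not in rem_set]

arr = np.arange(10)
removable = np.array([1, 3, 3, 7])  # made-up values, with a repeat

result = remove_list_fast(arr, removable)
print(result)  # [0, 2, 4, 5, 6, 8, 9]

# Same values as a set difference (sorted, since sets are unordered)
expected = sorted(set(arr.tolist()) - set(removable.tolist()))
print(result == expected)  # True
```

Unlike the pure set difference, this variant keeps the original order and any duplicates in arr.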

marc_s
Engineero