How to remove specific elements in a numpy array

Question

How can I remove some specific elements from a numpy array? Say I have

import numpy as np

a = np.array([1,2,3,4,5,6,7,8,9])

I then want to remove 3,4,7 from a. All I know is the index of the values (index=[2,3,6]).

score 437 · Accepted Answer · edited Aug 19 '23 at 10:43

Use numpy.delete(), which returns a new array with sub-arrays along an axis deleted.

numpy.delete(a, index)

For your specific question:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
index = [2, 3, 6]

new_a = np.delete(a, index)

print(new_a)
# Output: [1 2 5 6 8 9]

Note that numpy.delete() returns a new array because a NumPy array has a fixed size once it is created: individual elements can be changed in place, but elements cannot simply be dropped from the existing buffer, so removing them always produces a new object. To quote the delete() docs:

"A copy of arr with the elements specified by obj removed. Note that delete does not occur in-place..."
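To make the quoted behaviour concrete, here is a minimal sketch (the variable names are mine, not from the docs). It shows that the original array is left untouched, and that the index argument can also be a slice object built with np.s_:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# np.delete builds a new array; `a` keeps all nine elements
new_a = np.delete(a, [2, 3, 6])

# the second argument may also be a slice; this drops positions 0, 2, 4, 6, 8
evens_only = np.delete(a, np.s_[::2])

print(a.size, new_a.size)   # 9 6
print(evens_only.tolist())  # [2, 4, 6, 8]
```

If you want the shorter array under the old name, assign the result back yourself: a = np.delete(a, index).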


score 116 · Answer 2 · edited Aug 19 '23 at 10:44

Use np.setdiff1d:

>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> b = np.array([3,4,7])
>>> c = np.setdiff1d(a,b)
>>> c
array([1, 2, 5, 6, 8, 9])

Mateen Ulhaq

answered Apr 06 '16 at 19:33

Zong

Good to know. I was thinking that np.delete would be slower, but alas, timeit for 1000 integers says delete is 2x faster. – wbg Jun 09 '16 at 21:52
It is important to note that this is a set difference, so if there are duplicate elements in the array, they will be removed as well. Consider the case where `a = np.array([-1,1,2,3,4,5,6,7,1,2,3,4,5,6,7,8,9,10,1,2,3,99])` and `b = np.array([-1,99])`; then `c = np.setdiff1d(a,b)` gives `array([1,2,3,4,5,6,7,8,9,10])`. – athina.bikaki Jan 05 '23 at 21:27
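The point about duplicates is easy to check on a small input. The following is only a sketch with made-up data, comparing np.setdiff1d against a boolean mask built with np.isin:

```python
import numpy as np

a = np.array([5, 1, 1, 2, 3, 99, 2])
b = np.array([99])

# set difference: result is sorted and duplicates are collapsed
as_set = np.setdiff1d(a, b)

# boolean mask: original order and duplicates are preserved
as_mask = a[~np.isin(a, b)]

print(as_set.tolist())   # [1, 2, 3, 5]
print(as_mask.tolist())  # [5, 1, 1, 2, 3, 2]
```

So setdiff1d is fine when the array is already sorted and unique, as in the question, but the mask is the safer choice otherwise.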

score 55 · Answer 3 · answered Jun 12 '12 at 12:03

A Numpy array is immutable, meaning you technically cannot delete an item from it. However, you can construct a new array without the values you don't want, like this:

b = np.delete(a, [2,3,6])

Digitalex

technically, numpy arrays ARE mutable. For example, this: `a[0]=1` modifies `a` in place. But they can not be resized. – btel Oct 23 '14 at 17:16
The definition says it's immutable, but if assigning a new value lets you modify it, then how is it immutable? – Devesh Mar 04 '19 at 04:09
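The distinction raised in these comments can be checked directly. A small sketch (my own example, not from the answer): element assignment changes the array in place, while np.delete hands back a separate array that does not share memory with the original.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

a[0] = 100           # in-place assignment works: the contents are mutable
view = a[1:3]        # basic slicing returns a view onto the same buffer
view[0] = 200        # writing through the view also changes `a`

b = np.delete(a, 0)  # "removal" allocates a separate array

print(a.tolist())                 # [100, 200, 3, 4, 5]
print(b.tolist())                 # [200, 3, 4, 5]
print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, b))     # False
```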

score 51 · Answer 4 · answered Apr 26 '19 at 09:43

To delete by value:

modified_array = np.delete(original_array, np.where(original_array == value_to_delete))

Prakhar Pandey

From numpy 1.19 one can just do: `np.delete(original_array, original_array==value)` https://numpy.org/doc/stable/reference/generated/numpy.delete.html – Alessandro Romancino Jan 15 '23 at 19:26
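For a self-contained illustration of deleting by value, here is a sketch with sample data of my own. It uses the integer positions from np.where and, for comparison, plain boolean indexing, which needs no call to np.delete at all:

```python
import numpy as np

original_array = np.array([4, 7, 4, 1, 4, 9])
value_to_delete = 4

# np.where returns a tuple of index arrays; [0] picks the one for axis 0
positions = np.where(original_array == value_to_delete)[0]
via_delete = np.delete(original_array, positions)

# boolean indexing: keep everything that is not equal to the value
via_mask = original_array[original_array != value_to_delete]

print(via_delete.tolist())  # [7, 1, 9]
print(via_mask.tolist())    # [7, 1, 9]
```

Both forms remove every occurrence of the value, not just the first.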

Andreas K. · Answer 5 · 2020-04-04T06:36:15.560

Using np.delete is the fastest way to do it, if we know the indices of the elements that we want to remove. However, for completeness, let me add another way of "removing" array elements using a boolean mask created with the help of np.isin. This method allows us to remove the elements by specifying them directly or by their indices:

import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

Remove by indices:

indices_to_remove = [2, 3, 6]
a = a[~np.isin(np.arange(a.size), indices_to_remove)]

Remove by elements (don't forget to recreate the original a since it was rewritten in the previous line):

elements_to_remove = a[indices_to_remove]  # [3, 4, 7]
a = a[~np.isin(a, elements_to_remove)]
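One way to avoid having to recreate a between the two variants is to wrap each in a function that returns a new array. This is only a sketch; remove_by_index and remove_by_value are hypothetical helper names, not NumPy functions:

```python
import numpy as np

def remove_by_index(arr, indices):
    """Return a copy of a 1-D array without the given positions."""
    keep = ~np.isin(np.arange(arr.size), indices)
    return arr[keep]

def remove_by_value(arr, values):
    """Return a copy of a 1-D array without any element found in values."""
    return arr[~np.isin(arr, values)]

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(remove_by_index(a, [2, 3, 6]).tolist())  # [1, 2, 5, 6, 8, 9]
print(remove_by_value(a, [3, 4, 7]).tolist())  # [1, 2, 5, 6, 8, 9]
print(a.size)                                  # 9, the input is untouched
```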

Gareth Latty · Answer 6 · 2012-06-12T12:16:46.327

Not being a numpy person, I took a shot with:

>>> import numpy as np
>>> import itertools
>>> 
>>> a = np.array([1,2,3,4,5,6,7,8,9])
>>> index=[2,3,6]
>>> a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))
>>> a
array([1, 2, 5, 6, 8, 9])

According to my tests, this outperforms numpy.delete(). I don't know why that would be the case, maybe due to the small size of the initial array?

python -m timeit -s "import numpy as np" -s "import itertools" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))"
100000 loops, best of 3: 12.9 usec per loop

python -m timeit -s "import numpy as np" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "np.delete(a, index)"
10000 loops, best of 3: 108 usec per loop

That's a pretty significant difference (in the opposite direction to what I was expecting), anyone have any idea why this would be the case?

Even more weirdly, passing numpy.delete() a list performs worse than looping through the list and giving it single indices.

python -m timeit -s "import numpy as np" -s "a = np.array([1,2,3,4,5,6,7,8,9])" -s "index=[2,3,6]" "for i in index:" "    np.delete(a, i)"
10000 loops, best of 3: 33.8 usec per loop

Edit: It does appear to be to do with the size of the array. With large arrays, numpy.delete() is significantly faster.

python -m timeit -s "import numpy as np" -s "import itertools" -s "a = np.array(list(range(10000)))" -s "index=[i for i in range(10000) if i % 2 == 0]" "a = np.array(list(itertools.compress(a, [i not in index for i in range(len(a))])))"
10 loops, best of 3: 200 msec per loop

python -m timeit -s "import numpy as np" -s "a = np.array(list(range(10000)))" -s "index=[i for i in range(10000) if i % 2 == 0]" "np.delete(a, index)"
1000 loops, best of 3: 1.68 msec per loop

Obviously, this is all pretty irrelevant, as you should always go for clarity and avoid reinventing the wheel, but I found it a little interesting, so I thought I'd leave it here.

Be careful with what you actually compare! You have `a = delete_stuff(a)` in your first benchmark, which makes `a` smaller with every iteration. When you use the built-in function, you don't store the value back to `a`, which keeps `a` at its original size! Besides that, you can speed up your function drastically if you create a set out of `index` and check against that to decide whether or not to delete an item. Fixing both things, I get for 10k items: 6.22 msec per loop with your function, 4.48 msec for `numpy.delete`, which is roughly what you would expect. – Michael, Jan 20 '13 at 05:30
Two more hints: Instead of `np.array(list(range(x)))` use `np.arange(x)`, and for creating the index, you can use `np.s_[::2]`. – Michael, Jan 20 '13 at 05:41
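A fairer comparison along the lines suggested in these comments might look like the sketch below. The function names with_compress and with_delete are mine, and the timings will vary by machine and NumPy version, so no numbers are claimed here:

```python
import itertools
import timeit

import numpy as np

a = np.arange(10_000)
index = np.arange(0, 10_000, 2)   # every other position
index_set = set(index.tolist())   # constant-time membership tests

def with_compress(arr):
    # the result is returned, not assigned back to `a`, so the input never shrinks
    selectors = [i not in index_set for i in range(len(arr))]
    return np.array(list(itertools.compress(arr, selectors)))

def with_delete(arr):
    return np.delete(arr, index)

# both approaches must agree before the timings mean anything
same_result = np.array_equal(with_compress(a), with_delete(a))

t_compress = timeit.timeit(lambda: with_compress(a), number=20)
t_delete = timeit.timeit(lambda: with_delete(a), number=20)
print(f"same result: {same_result}")
print(f"compress: {t_compress:.4f} s, np.delete: {t_delete:.4f} s")
```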

score 6 · Answer 7 · answered Oct 16 '20 at 13:56

In case you don't have the indices of the elements you want to remove, you can use the function in1d provided by numpy.

The function returns True if the element of a 1-D array is also present in a second array. To delete the elements, you just have to negate the values returned by this function.

Notice that this method keeps the order from the original array.

In [1]: import numpy as np

        a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
        rm = np.array([3, 4, 7])
        # np.in1d returns True if the element of `a` is in `rm`
        idx = np.in1d(a, rm)
        idx

Out[1]: array([False, False,  True,  True, False, False,  True, False, False])

In [2]: # Since we want the opposite of what `in1d` gives us, 
        # you just have to negate the returned value
        a[~idx]

Out[2]: array([1, 2, 5, 6, 8, 9])
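As a side note, NumPy's documentation recommends np.isin over in1d for new code. A minimal sketch of the same removal with np.isin, using its invert parameter so that no separate negation is needed:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
rm = np.array([3, 4, 7])

# invert=True flips the membership test: True where the element is NOT in rm
keep = np.isin(a, rm, invert=True)

print(a[keep].tolist())  # [1, 2, 5, 6, 8, 9]
```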

score 2 · Answer 8 · answered Nov 22 '17 at 05:22

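If you don't know the indices, you can use logical_and to build a boolean mask from a condition on the values:

import numpy as np

x = 10*np.random.randn(1,100)
low = 5
high = 27
# selects the elements of row 0 that lie strictly between low and high
x[0,np.logical_and(x[0,:]>low,x[0,:]<high)]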
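Note that the expression above selects the values inside the range rather than removing them. A deterministic sketch (fixed sample data of my own instead of random numbers) showing both the selection and its complement:

```python
import numpy as np

x = np.array([1.0, 6.5, 30.0, 12.0, -4.0, 26.9])
low = 5
high = 27

inside = np.logical_and(x > low, x < high)

kept = x[inside]      # values strictly between low and high
removed = x[~inside]  # the complement: in-range values are dropped

print(kept.tolist())     # [6.5, 12.0, 26.9]
print(removed.tolist())  # [1.0, 30.0, -4.0]
```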

idnavid

score 2 · Answer 9 · answered Jul 21 '19 at 12:09

Remove specific indices (I removed 16 and 21 from the matrix):

import numpy as np
mat = np.arange(12,26)
a = [4,9]
del_map = np.delete(mat, a)
del_map.reshape(3,4)

Output:

array([[12, 13, 14, 15],
       [17, 18, 19, 20],
       [22, 23, 24, 25]])
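If the data is already two-dimensional, np.delete can also drop whole rows or columns through its axis parameter. A short sketch with a 3x4 matrix of my own choosing:

```python
import numpy as np

mat = np.arange(12, 24).reshape(3, 4)
# [[12 13 14 15]
#  [16 17 18 19]
#  [20 21 22 23]]

without_row = np.delete(mat, 1, axis=0)        # drop the middle row
without_cols = np.delete(mat, [0, 3], axis=1)  # drop the first and last columns

print(without_row.tolist())   # [[12, 13, 14, 15], [20, 21, 22, 23]]
print(without_cols.tolist())  # [[13, 14], [17, 18], [21, 22]]
```

Without axis, np.delete works on the flattened array, which is why the answer above has to reshape afterwards.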

Raja Ahsan Zeb

score 2 · Answer 10 · answered Sep 02 '20 at 20:08

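A list comprehension could be an interesting approach as well.

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
index = np.array([2, 3, 6])  # index is changed to an array
# keep a value only if its position differs from every entry in index
out = [val for i, val in enumerate(a) if all(i != index)]
# out is a Python list holding the values 1, 2, 5, 6, 8, 9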

Mauricio Arboleda-Zapata

OlDor · Answer 11 · 2019-09-25T14:09:04.427

You can also use sets:

import numpy

a = numpy.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
the_index_list = [2, 3, 6]

the_big_set = set(numpy.arange(len(a)))
the_small_set = set(the_index_list)
# sets are unordered, so sort the surviving indices to keep the original order
the_delta_row_list = sorted(the_big_set - the_small_set)

a = a[the_delta_row_list]
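The same idea can be expressed without Python sets by letting np.setdiff1d compute the surviving indices, which it returns already sorted. A sketch reusing the names from the snippet above (keep is my own variable name):

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
the_index_list = [2, 3, 6]

# setdiff1d returns the sorted positions that are NOT in the_index_list
keep = np.setdiff1d(np.arange(len(a)), the_index_list)

print(keep.tolist())     # [0, 1, 4, 5, 7, 8]
print(a[keep].tolist())  # [10, 20, 50, 60, 80, 90]
```

Because the set difference is taken over indices rather than values, duplicates in a are not affected.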

edited Sep 25 '19 at 14:09

answered Sep 25 '19 at 13:34

keramat · Answer 12 · 2021-10-25T08:48:34.440

Filter the part that you do not need:

import numpy as np
a = np.array([1,2,3,4,5,6,7,8,9])
a = a[(a!=3)&(a!=4)&(a!=7)]

If you have a list of indices to be removed:

to_be_removed_inds = [2,3,6]
a = np.array([1,2,3,4,5,6,7,8,9])
a = a[[x for x in range(len(a)) if x not in to_be_removed_inds]]
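For longer arrays, a vectorised alternative to the list comprehension is to start from an all-True mask and switch off the unwanted positions. A minimal sketch; mask is my own variable name:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
to_be_removed_inds = [2, 3, 6]

mask = np.ones(a.size, dtype=bool)  # start by keeping everything
mask[to_be_removed_inds] = False    # switch off the unwanted positions

print(a[mask].tolist())  # [1, 2, 5, 6, 8, 9]
```

This avoids the repeated `x not in list` lookups, which get slow as the index list grows.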

edited Oct 25 '21 at 08:48

answered Oct 03 '21 at 08:50

score 0 · Answer 13 · answered Jan 15 '23 at 19:35

If you do not know the indices, you can now do something like this:

import numpy as np

arr = [1, 2, 3, 4, 5, 6, 7, 8, 9]
values = [3, 4, 7]
mask = np.isin(arr, values)
arr = np.delete(arr, mask)

Passing a boolean mask to np.delete is supported since NumPy 1.19.
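On NumPy versions older than 1.19, where a boolean argument is not treated as a mask, the same result can be had by converting the mask to integer positions first. A minimal sketch, reusing the names from the snippet above (positions is my own variable name):

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
values = [3, 4, 7]
mask = np.isin(arr, values)

# np.flatnonzero turns the mask into the integer positions of its True entries
positions = np.flatnonzero(mask)

print(positions.tolist())                  # [2, 3, 6]
print(np.delete(arr, positions).tolist())  # [1, 2, 5, 6, 8, 9]
```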

Alessandro Romancino
