4

When calling `np.delete()`, I don't want to define a new variable for the reduced-size array. I want to perform the deletion on the original NumPy array. Any thoughts?

>>> arr = np.array([[1,2], [5,6], [9,10]])
>>> arr
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
>>> np.delete(arr, 1, 0)
array([[ 1,  2],
       [ 9, 10]])
>>> arr
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
but I want:
>>> arr
array([[ 1,  2],
       [ 9, 10]])
P.J

5 Answers

4

NumPy arrays are fixed-size, so there can't be an in-place version of np.delete. Any such function would have to change the array's size.

The closest you can get is reassigning the arr variable:

arr = numpy.delete(arr, 1, 0)
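As a minimal sketch of what that reassignment does (the name `alias` is only an illustration, not part of the question): the name `arr` is rebound to a new array, while any other reference to the original object still sees all three rows.

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])
alias = arr                   # second name bound to the same array object

arr = np.delete(arr, 1, 0)    # rebinds `arr` to a newly allocated array

print(arr.shape)              # (2, 2)
print(alias.shape)            # (3, 2): the original object is unchanged
print(arr is alias)           # False
```

If nothing else references the original array, it becomes eligible for garbage collection after the reassignment.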
user2357112
1

The delete call doesn't modify the original array; it copies it and returns the copy after the deletion is done.

>>> arr1 = np.array([[1,2], [5,6], [9,10]])
>>> arr2 = np.delete(arr1, 1, 0)
>>> arr1
array([[ 1,  2],
       [ 5,  6],
       [ 9, 10]])
>>> arr2
array([[ 1,  2],
       [ 9, 10]])
Olian04
1

If it's a matter of performance, you might want to try (but test it, since I'm not sure) creating a view* instead of using np.delete. You can do it by slicing, which should be an in-place operation:

import numpy as np

arr = np.array([[1,  2], [5,  6], [9, 10]])
arr = arr[(0, 2), :]
print(arr)

resulting in:

[[ 1  2]
 [ 9 10]]

This, however, will not free the memory occupied by the excluded row. It might improve performance, but memory-wise you might have the same or a worse problem. Also notice that, as far as I know, there is no way of indexing by exclusion (for instance, something like arr[~1] would be very useful), so you will necessarily spend resources building an index array.
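One way to express "every row except row 1" is a boolean mask. This is only a sketch, and the names `mask` and `reduced` are illustrative. It still builds an index array, and boolean indexing is a form of advanced indexing, so the result is a copy.

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])

mask = np.ones(arr.shape[0], dtype=bool)  # start by keeping every row
mask[1] = False                           # exclude row 1
reduced = arr[mask]                       # boolean indexing returns a copy

print(reduced)
```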

For most cases I think the suggestion other users have given, namely:

arr = numpy.delete(arr, 1, 0)

, is the best. In some cases it might be worth exploring the other alternative.

EDIT: *This is actually incorrect (thanks @user2357112). Fancy indexing does not create a view but instead returns a copy as can be seen in the documentation (which I should have checked before jumping to conclusions, sorry about that):

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

As such, I'm unsure whether the fancy indexing suggestion is worth anything as an actual suggestion unless it has a performance gain over the np.delete method (which I'll try to verify when the opportunity arises; see EDIT2).
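The view-versus-copy distinction can be checked directly with `np.shares_memory`. A small sketch, with `basic` and `fancy` as illustrative names:

```python
import numpy as np

arr = np.array([[1, 2], [5, 6], [9, 10]])

basic = arr[0:2, :]      # basic slicing: a view onto arr's buffer
fancy = arr[(0, 2), :]   # advanced (fancy) indexing: a new array

print(np.shares_memory(arr, basic))  # True
print(np.shares_memory(arr, fancy))  # False
```

Note that basic slicing can only select regularly spaced rows, so it cannot skip a single row in the middle of an array.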

EDIT2: I performed a very simple test to see if there is any performance gain from using fancy indexing as opposed to the delete function. I used timeit (this is actually the first time I've used it, but it seems the default number of executions per snippet is 1,000,000, hence the high numbers for time):

import numpy as np
import timeit

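def test1():
    arr = np.array([[1, 2], [5, 6], [9, 10]])
    return arr[(0, 2), :]  # fancy indexing: keep rows 0 and 2

def test2():
    arr = np.array([[1, 2], [5, 6], [9, 10]])
    return np.delete(arr, 1, 0)  # delete row 1 along axis 0

# The functions must return their arrays; otherwise this compares None == None.
print("Equality test: ", np.array_equal(test1(), test2()))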

print(timeit.timeit("test1()", setup="from __main__ import test1"))
print(timeit.timeit("test2()", setup="from __main__ import test2"))

The results are these:

Equality test:  True
5.43569152576767
9.476918448174644

This represents a considerable speed gain. Nevertheless, notice that building the index sequence for the fancy indexing will take time. Whether it is worth it will surely depend on the problem being solved.
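The timings above include creating the array on every call and use a 3x2 input. A hedged variant that builds a larger array once and times only the deletion might look like the sketch below. The array size, the repeat count, and the names `keep`, `fancy` and `delete` are assumptions for illustration, and the index array is prebuilt, so its construction cost is deliberately left out of the measurement.

```python
import timeit
import numpy as np

n = 10_000
arr = np.arange(2 * n).reshape(n, 2)   # hypothetical larger input

# Indices of every row except row 1, built once outside the timed code
keep = np.concatenate((np.arange(0, 1), np.arange(2, n)))

def fancy():
    return arr[keep, :]            # fancy indexing with a prebuilt index

def delete():
    return np.delete(arr, 1, 0)    # delete row 1 along axis 0

# Both approaches should yield the same reduced array
assert np.array_equal(fancy(), delete())

t_fancy = timeit.timeit(fancy, number=1000)
t_delete = timeit.timeit(delete, number=1000)
print(f"fancy indexing: {t_fancy:.4f} s")
print(f"np.delete:      {t_delete:.4f} s")
```

Results will vary with array size, dtype and NumPy version, so it is worth running this against data that resembles the real workload.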

armatita
  • 1
    This doesn't actually create a view. Indexing operations classified as advanced indexing, such as what you get with that `(0, 2)`, don't produce views, since they don't produce the consistent strides necessary to create a view. – user2357112 Nov 04 '16 at 18:37
  • @user2357112 True. I should have checked the [documentation](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing) first. My mistake, I'll edit the post. Do you have any idea whether this choice might be faster performance-wise? My suggestion would be pretty useless if it's not. – armatita Nov 04 '16 at 22:05
  • I think it might avoid some of the overhead `numpy.delete` has. – user2357112 Nov 04 '16 at 23:34
1

You could implement your own version of delete which copies data elements after the elements to be deleted forward, and then returns a view excluding the (now obsolete) last element:

import numpy as np


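# In-place delete: shifts later elements forward within the same buffer
def np_delete(arr, obj, axis=None):
    # This is only a simplified example: one integer index, 1-D array
    assert isinstance(obj, int)
    assert axis is None

    # Move every element after the deleted one forward by one position
    for i in range(obj + 1, arr.size):
        arr[i - 1] = arr[i]
    # Return a view that excludes the now-obsolete last element
    return arr[:-1]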


Test = 10 * np.arange(10)
print(Test)

deleteIndex = 5
print(np.delete(Test, deleteIndex))
print(np_delete(Test, deleteIndex))
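A block-wise variant of the same idea is sketched below, using a single slice assignment instead of a Python loop. The helper name `delete_inplace_1d` is hypothetical and it only handles a 1-D array with one integer index. It also makes the side effects visible: the returned array is a view, and the original buffer keeps its full length with a stale duplicate at the end.

```python
import numpy as np

def delete_inplace_1d(arr, index):
    """Hypothetical helper: shift the tail left by one, return a shorter view."""
    arr[index:-1] = arr[index + 1:]   # block copy within the same buffer
    return arr[:-1]                   # view onto the first n-1 elements

data = 10 * np.arange(10)
result = delete_inplace_1d(data, 5)

print(result)                          # [ 0 10 20 30 40 60 70 80 90]
print(np.shares_memory(result, data))  # True
print(data)                            # [ 0 10 20 30 40 60 70 80 90 90]
```

No memory is released by this approach; the caller simply agrees to use the shorter view from then on.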
bers
  • 1
    I use this in an algorithm where a row and column gets deleted in each step. This solution is already faster than using `numpy.delete` there. Using the `@jit` decorator from the Numba module on this function makes it even faster still. – Jens Renders Mar 24 '19 at 17:56
  • @JensRenders It sounds like you are deleting multiple items. I expect you are taking all deletions into account at once and copying each element only once? Sounds like the implementation I was too lazy to try and post here :) Good to know it's faster in some cases. I guess you also copy data block-wise instead of element by element as I imply above? – bers Mar 24 '19 at 21:04
  • One more thought: is your implementation faster also in the worst case, deleting elements from the beginning? – bers Mar 24 '19 at 21:05
  • Yes, shifting blocks makes more sense in the code. After using the `@jit` decorator, it doesn't make a difference anymore though. My current code runs faster than numpy.delete in any case, because it doesn't need to allocate new memory. If I time my code together with the allocation of some useless memory, it is about the same speed as numpy.delete – Jens Renders Mar 24 '19 at 22:08
0

There is nothing wrong with your code. You just have to reassign the variable:

    arr = np.array([[1,2], [5,6], [9,10]])
    arr = np.delete(arr, 1, 0)
Januka samaranyake
  • 1
    This is *not* changing the actual object. You are just pointing the `arr` name at a new object. See the other replies for this answer. – Hannes Ovrén Nov 04 '16 at 16:27