How to delete a row or column in a NumPy array without actually creating a new copy?

Question

I want to delete a particular row or column from a NumPy array in Python without actually creating a new copy.

Right now I'm doing arr = np.delete(arr, row_or_column_number, axis), but it returns a copy and I have to assign it back to itself every time.

I was wondering whether a more ingenious approach could be used, where the change is made to the array itself instead of creating a new copy every time.

wwl · Answer 1 · 2020-06-06T16:41:25.390

Once a numpy array is created, its size is fixed. To delete (or add) a column or row, a new copy needs to be created.
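As a minimal sketch of what that means in practice (the names arr and smaller are just illustrative), np.delete leaves the original array untouched and hands back a separate, smaller one:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
smaller = np.delete(arr, 1, axis=0)  # drop row 1

print(arr.shape)                       # (3, 4): the original is unchanged
print(smaller.shape)                   # (2, 4): the result is a different array
print(np.shares_memory(arr, smaller))  # False: the data was copied
```

Writing arr = np.delete(arr, ...) only rebinds the name arr to the new array; it does not avoid the copy.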

(Even if numpy had an option to drop columns without reassignment, it's likely that another copy would still be created. Another library, pandas, has an option called "inplace" that deletes a column from an object without any reassignment, but its use is discouraged, and it doesn't literally prevent a copy from being created. For these reasons, it may be deprecated in the future.)

score 1 · Answer 2 · answered Jun 06 '20 at 16:38

Unfortunately, you can't do this using numpy. Array scalars are immutable. See documentation.

Link to a related question: How to remove specific elements in a numpy array

Algef Almocera


score 1 · Accepted Answer · answered Jun 06 '20 at 17:51

In [114]: x = np.arange(12).reshape(3,4)                                        
In [115]: x.shape                                                               
Out[115]: (3, 4)
In [116]: x.ravel()                                                             
Out[116]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Do you understand how arrays are stored? Basically, there's a flat buffer holding the elements, much like this ravel, plus a shape and strides that describe how to step through that buffer. If not, you need to spend some time reading a numpy tutorial.
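A small sketch of that storage model, assuming an 8-byte integer dtype (set explicitly here so the stride values are the same on every platform):

```python
import numpy as np

x = np.arange(12, dtype=np.int64).reshape(3, 4)

print(x.shape)    # (3, 4)
print(x.strides)  # (32, 8): 32 bytes to the next row, 8 bytes to the next column

# A transpose only swaps shape and strides; the flat data is not touched.
print(x.T.strides)               # (8, 32)
print(np.shares_memory(x, x.T))  # True
```

Anything that can be expressed by changing only shape and strides can be a view. Dropping an arbitrary row from the middle generally cannot.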

Delete makes a new array:

In [117]: y = np.delete(x, 1, 0)                                                
In [118]: y                                                                     
Out[118]: 
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])
In [119]: y.shape                                                               
Out[119]: (2, 4)
In [120]: y.ravel()                                                             
Out[120]: array([ 0,  1,  2,  3,  8,  9, 10, 11])

This delete is the same as selecting 2 rows from x, x[[0,2],:].

Its data elements are different; it has to copy values from x. Whether you assign that back to x doesn't matter. Variable assignment is a trivial python operation. What matters is how the new array is created.
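To check this yourself, here is a short sketch (y and z are illustrative names) comparing np.delete with the equivalent list-based selection and confirming that neither result shares data with x:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
y = np.delete(x, 1, 0)  # remove row 1
z = x[[0, 2], :]        # select rows 0 and 2 with an index list

print(np.array_equal(y, z))      # True: same values
print(np.shares_memory(x, y))    # False: delete copied the data
print(np.shares_memory(x, z))    # False: indexing with a list copies too

y[0, 0] = 99                     # changing the copy...
print(x[0, 0])                   # 0: ...does not affect x
```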

Now, in this particular case it is possible to create a view. This is still a new array object, but it shares memory with x. That's possible because I am selecting a regular pattern (every second row), not an arbitrary subset of the rows or columns.

In [121]: x[0::2,:]                                                             
Out[121]: 
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])

Again, if view doesn't make sense, you need to read more numpy basics. And don't skip the python basics either.

Nice explanation. Though an off topic question, but which IDE do you use for python ? This `[In] [Out]` formatting looks quite intuitive to me . — Hissaan Ali, Jun 06 '20 at 18:35
I usually call `ipython` directly, but `jupyter console` does the same thing. `qtconsole` and `notebook` are other options. — hpaulj, Jun 06 '20 at 21:35
