I want to delete a particular row or column from a NumPy array in Python without actually creating a new copy.

Right now I'm doing arr = np.delete(arr, row_or_column_number, axis), but it returns a copy and I have to assign it back to itself every time.

Is there a more ingenious approach where the change is made to the array itself instead of creating a new copy every time?

Hissaan Ali

3 Answers

Once a numpy array is created, its size is fixed. To delete (or add) a column or row, a new copy needs to be created.
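A minimal sketch of what this means in practice, assuming nothing beyond NumPy itself. The name `original` is only an illustrative second reference; `np.shares_memory` is used to check whether the result of `np.delete` reuses the old buffer.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
original = arr  # illustrative second reference to the same array object

# np.delete builds a new, smaller array; rebinding `arr` does not touch `original`
arr = np.delete(arr, 1, axis=0)

print(original.shape)                   # (3, 4)
print(arr.shape)                        # (2, 4)
print(np.shares_memory(original, arr))  # False
```

The original array is unchanged and the result lives in separate memory, so the reassignment only changes which array the name `arr` points to.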

(Even if numpy had an option to drop columns without reassignment, it's likely that another copy would still be created. Another library, Pandas, has an option called "inplace" that deletes a column from an object without any reassignment, but its use is discouraged, and it doesn't literally prevent a copy from being created. For these reasons, it may be deprecated in the future.)

wwl

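Unfortunately, you can't do this using numpy: the size of an array is fixed once it is created, so removing elements always produces a new array. (Array scalars are likewise immutable. See documentation.)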

Link to a related question: How to remove specific elements in a numpy array

Algef Almocera
In [114]: x = np.arange(12).reshape(3,4)                                        
In [115]: x.shape                                                               
Out[115]: (3, 4)
In [116]: x.ravel()                                                             
Out[116]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Do you understand how arrays are stored? Basically there's a flat buffer holding the elements, much like this ravel, plus a shape and strides that describe how to step through that buffer. If not, you need to spend some time reading a numpy tutorial.
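As a rough illustration of the storage model described above, the sketch below prints the shape and strides of a small array. The dtype is fixed to `int64` here (an assumption made only so the byte counts are predictable), and `xt` is an illustrative name.

```python
import numpy as np

# int64 is chosen explicitly so each element occupies 8 bytes
x = np.arange(12, dtype=np.int64).reshape(3, 4)

print(x.shape)    # (3, 4)
print(x.strides)  # (32, 8): 32 bytes to the next row, 8 bytes to the next column

# Transposing only swaps shape and strides; the flat buffer is reused
xt = x.T
print(xt.strides)               # (8, 32)
print(np.shares_memory(x, xt))  # True
```

Operations that can be expressed by changing only shape and strides can reuse the buffer; removing an arbitrary row or column generally cannot.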

Delete makes a new array:

In [117]: y = np.delete(x, 1, 0)                                                
In [118]: y                                                                     
Out[118]: 
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])
In [119]: y.shape                                                               
Out[119]: (2, 4)
In [120]: y.ravel()                                                             
Out[120]: array([ 0,  1,  2,  3,  8,  9, 10, 11])

This delete is equivalent to selecting two rows from x with x[[0,2],:].

Its data elements are different; it has to copy values from x. Whether you assign the result back to x doesn't matter: variable assignment is a trivial Python operation that just rebinds a name. What matters is how the new array is created.
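A short check of both claims, as a sketch: the delete result and the index-list selection hold the same values, and neither shares memory with x. The name `z` is illustrative.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

y = np.delete(x, 1, axis=0)  # drop row 1
z = x[[0, 2], :]             # select rows 0 and 2 with an index list

print(np.array_equal(y, z))     # True
print(np.shares_memory(x, y))   # False
print(np.shares_memory(x, z))   # False

# Writing to the copy leaves x untouched
y[0, 0] = 99
print(x[0, 0])                  # 0
```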

Now in this particular case it is possible to create a view. This is still a new array object, but it shares memory with x. That's possible because I am selecting a regular pattern (every second row), not an arbitrary subset of the rows or columns.

In [121]: x[0::2,:]                                                             
Out[121]: 
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])

Again, if the idea of a view doesn't make sense, you need to read more about numpy basics. And don't skip the Python basics either.

hpaulj