I am working on a Python project and have been stuck on one problem for hours. I would really appreciate your help!

Here is the question:

I have a very large NumPy array X (1300000 × 110), and I want to delete a batch of rows from it all at once. The indices of the rows to delete are stored in a Python list. Let's say X is the array and lis is the list.

Is there a NumPy function that can do this, or some other smart trick?

cs95
fxy
  • By the way, since the array is large, can the method used to delete the rows be fast? – fxy Aug 01 '17 at 01:07
  • In your case `np.delete` creates a `mask=np.ones(nrows, bool)`, sets the entries to delete to False with `mask[idx]=False`, and returns `your_array[mask,:]`. In other words, it uses a boolean mask to select the rows you want to keep. – hpaulj Aug 01 '17 at 01:51
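
The boolean-mask approach described in the comment above can be sketched roughly as follows. This is only an illustration of the idea, not NumPy's actual source; the helper name `delete_rows_with_mask` is made up for this example, and a small array stands in for the real X.

```python
import numpy as np

def delete_rows_with_mask(arr, idx):
    """Drop the rows listed in idx by keeping all the others (hypothetical helper)."""
    mask = np.ones(arr.shape[0], dtype=bool)  # start by keeping every row
    mask[idx] = False                          # mark the rows to delete
    return arr[mask, :]                        # boolean indexing returns a new array

# Small stand-in for the large array X and the index list lis from the question
X = np.arange(20).reshape(10, 2, order='F')
lis = [0, 3, 4, 7]
result = delete_rows_with_mask(X, lis)
```

The mask is built once and the array is copied once, no matter how many indices are in lis.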

1 Answer

There is a NumPy function for this, np.delete:

np.delete(arr, indices_to_be_deleted, axis=0)

For example,

In [91]: arr = np.arange(20).reshape(10,2, order='F'); arr
Out[91]: 
array([[ 0, 10],
       [ 1, 11],
       [ 2, 12],
       [ 3, 13],
       [ 4, 14],
       [ 5, 15],
       [ 6, 16],
       [ 7, 17],
       [ 8, 18],
       [ 9, 19]])

In [92]: np.delete(arr, [0,3,4,7], axis=0)
Out[92]: 
array([[ 1, 11],
       [ 2, 12],
       [ 5, 15],
       [ 6, 16],
       [ 8, 18],
       [ 9, 19]])
unutbu
  • I have tried this function, but there is a problem. I have many rows to delete, say the 6th row, the 9th row, the 100th row, and so on. Once I delete the 6th row using X=np.delete(X,6,0), the index of the 9th row shifts to 8, which is inconvenient and computationally slow. – fxy Aug 01 '17 at 01:17
  • @fxy Err, you specify a list of indices... – cs95 Aug 01 '17 at 01:23
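
To make the point in the last two comments concrete, here is a small sketch (using a toy array in place of the real X) comparing a single `np.delete` call with a list of indices against deleting the rows one at a time. The variable names `all_at_once` and `one_by_one` are just for illustration.

```python
import numpy as np

X = np.arange(20).reshape(10, 2, order='F')
lis = [0, 3, 4, 7]

# One call: every index in lis refers to a row of the original X,
# so there is no index shifting to worry about.
all_at_once = np.delete(X, lis, axis=0)

# Deleting one row at a time gives the same result only if the rows are
# removed from the highest index down, so that earlier deletions do not
# shift the rows that still have to be removed.
one_by_one = X
for i in sorted(lis, reverse=True):
    one_by_one = np.delete(one_by_one, i, axis=0)
```

Because `np.delete` returns a new array on every call, the loop copies the data once per deleted row, which is why passing the whole list in a single call should be much cheaper for an array of this size.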