This is the fastest way I could come up with:
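import numpy
# 500,000 rows of two columns; the first column holds the even numbers.
x = numpy.arange(1000000, dtype=numpy.int32).reshape((-1,2))
# Values to look for in the first column of x.
bad = numpy.arange(0, 1000000, 2000, dtype=numpy.int32)
print(x.shape)
print(bad.shape)
# Drop every row whose first-column value appears in bad.
cleared = numpy.delete(x, numpy.where(numpy.in1d(x[:,0], bad)), 0)
print(cleared.shape)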
This prints:
(500000, 2)
(500,)
(499500, 2)
and runs much faster than a ufunc. It will use some extra memory, but whether that's okay for you depends on how big your array is.
Explanation:
- numpy.in1d returns a boolean array the same length as x[:,0], containing True where the element is in the bad array and False otherwise.
- numpy.where turns that True/False array into an array of integer indices marking where it was True.
- Those indices are then passed to numpy.delete, which deletes the matching rows along the first axis (0).
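
To make the three steps easier to follow, here is a minimal sketch on a tiny array, so each intermediate result can be checked by eye. The four-row data and the names mask, rows and cleared_by_mask are made up for illustration. It also uses numpy.isin rather than numpy.in1d, on the assumption that you are on a recent NumPy where in1d is deprecated; on an older install, in1d should behave the same for this 1-D input.

```python
import numpy

# Tiny stand-in for the large array: four rows keyed by their first column.
x = numpy.array([[0, 10], [2, 20], [4, 40], [6, 60]], dtype=numpy.int32)
bad = numpy.array([2, 6], dtype=numpy.int32)

# Step 1: boolean mask, True where the first column appears in bad.
mask = numpy.isin(x[:, 0], bad)

# Step 2: integer row indices where the mask is True.
rows = numpy.where(mask)[0]

# Step 3: delete those rows along axis 0.
cleared = numpy.delete(x, rows, 0)

# Alternative (not the approach above): index with the inverted mask,
# which skips the where/delete steps entirely.
cleared_by_mask = x[~mask]

print(mask.tolist())     # [False, True, False, True]
print(rows.tolist())     # [1, 3]
print(cleared.tolist())  # [[0, 10], [4, 40]]
```

The inverted-mask form gives the same rows here and may be worth timing against numpy.delete on your real data.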