Indexing NumPy record arrays with an array of indices appears to be very slow. However, the same operation runs 10-15 times faster when it goes through ndarray.view.
Is there a reason behind this difference? Why isn't indexing of record arrays implemented in a faster way? (See also the related question "sorting numpy structured and record arrays is very slow".)
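import numpy as np

# Structured dtype with two fields: int32 "foo" and int64 "bar" (12 bytes per record)
mydtype = np.dtype("i4,i8")
mydtype.names = ("foo", "bar")
N = 100000
foobar = np.zeros(N, dtype=mydtype)
foobar["foo"] = np.random.randint(0, 100, N)
foobar["bar"] = np.random.randint(0, 10000, N)
# Index array that sorts by "bar" first, then by "foo"
b = np.lexsort((foobar["foo"], foobar["bar"]))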
timeit foobar[b]
100 loops, best of 3: 11.2 ms per loop
timeit foobar.view("|S12")[b].view(mydtype)
1000 loops, best of 3: 882 µs per loop
Both expressions return the same result.
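For completeness, here is a small self-contained sketch that checks the two indexing paths agree. It builds the dtype from an explicit field list (which should be equivalent to the "i4,i8" dtype with renamed fields above) and uses a seeded generator; the seed and the names `plain`, `via_view` and `same` are my own choices for illustration, not part of the timings shown.

```python
import numpy as np

# Same layout as in the question: int32 "foo" + int64 "bar", 12 bytes per record.
mydtype = np.dtype([("foo", "i4"), ("bar", "i8")])

N = 100000
rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
foobar = np.zeros(N, dtype=mydtype)
foobar["foo"] = rng.integers(0, 100, N)
foobar["bar"] = rng.integers(0, 10000, N)

# Index array that sorts by "bar" first, then by "foo".
b = np.lexsort((foobar["foo"], foobar["bar"]))

# Path 1: index the structured array directly.
plain = foobar[b]

# Path 2: reinterpret each record as a 12-byte string, index, reinterpret back.
via_view = foobar.view("|S12")[b].view(mydtype)

# Compare field by field.
same = all(np.array_equal(plain[name], via_view[name]) for name in mydtype.names)
print(same)
```

The "|S12" in the view only works because the record size is exactly 12 bytes; for a different dtype the width would have to match `foobar.dtype.itemsize`.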