1

Right now, I have code that basically looks like:

for x in range(input.shape[0]):
    for y in range(input.shape[1]):
        output[x,y] = map[ input[x,y] ]

where output, input and map are all numpy arrays (map is size 256, all are type uint8).

This works, but it's slow. Loops like this should be in C. That's what numpy is for.

Is there a numpy function (or a cv2 function, I'm already importing that anyway) that will do this?

dspeyer

2 Answers

3

How about:

output = map[input]
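
A minimal sketch of why this one-liner is equivalent to the double loop, checked on a small random array. The names `image` and `lut` are stand-ins chosen here for `input` and `map` (to avoid shadowing the Python builtins), and the inverting table is only an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the example
image = rng.integers(0, 256, size=(4, 6), dtype=np.uint8)  # stand-in for `input`
lut = (255 - np.arange(256)).astype(np.uint8)  # stand-in for `map`: an inverting table

# Reference result: the explicit double loop from the question.
expected = np.empty_like(image)
for x in range(image.shape[0]):
    for y in range(image.shape[1]):
        expected[x, y] = lut[image[x, y]]

# Vectorized: index the 256-entry table with the whole image at once.
# The result has the shape of `image` and the dtype of `lut`.
output = lut[image]

print(np.array_equal(output, expected))
```

Indexing a 1-D array with an integer array of any shape returns an array of that same shape, which is why no flatten or reshape step is needed.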
Eelco Hoogendoorn
  • No need to flatten and reshape, `map_[input]` just works. (I renamed `map` to `map_` for the usual reasons). – Paul Panzer Nov 29 '18 at 20:34
  • Yep. I don't know how sensitive this is to my system, but `map.take(input)` is consistently a little to a fair bit faster than `map[input]`. – arra Nov 29 '18 at 20:53
  • And either of these is *substantially* faster than `f[a.flatten()].reshape(a.shape)`. – arra Nov 29 '18 at 21:01
  • Ah yeah, the reshapes are not required. As for speed: the reshapes are O(1), so if they are noticeable, all that proves is that your test arrays are tiny – Eelco Hoogendoorn Nov 29 '18 at 21:03
  • That's what I would have thought, but these were actually based on an image array with something on the order of `n = int(1e8)` elements. `f[a]`, `f.take(a)`, and `f[a.flatten()].reshape(a.shape)` take 687ms, 429ms, and 719ms on average, respectively. The differences are starker when you have `a <- a.astype(np.intp)`. In this case both `f[a]` and `f.take(a)` hover around 110ms and `f[a.flatten()].reshape(a.shape)` is typically greater than 300ms. – arra Nov 29 '18 at 23:06
  • Ah; flatten copies, whereas ravel makes a view where it can. That nuance wasn't on my radar; not the most self-descriptive names either... but with ravel I wouldn't expect a difference – Eelco Hoogendoorn Nov 29 '18 at 23:11
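
The flatten/ravel distinction from the last comments can be checked directly. This is a small sketch, assuming a C-contiguous array (the case where `ravel` is able to return a view); the array `a` here is just an example:

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.uint8)  # C-contiguous example array

flat_copy = a.flatten()  # always allocates a new array
flat_view = a.ravel()    # a view here, because `a` is contiguous

# np.shares_memory reports whether two arrays overlap in memory.
copy_shares = np.shares_memory(a, flat_copy)
view_shares = np.shares_memory(a, flat_view)

print(copy_shares, view_shares)
```

The copy made by `flatten` costs time proportional to the array size, which would be consistent with the slower timings reported for the flatten-and-reshape variant.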
0

You're looking for np.take, which is as simple as map.take(input). Both this and Eelco's solution are much faster than yours. On my system this is about 70 percent faster than Eelco's, though your mileage may vary, and for input.shape >> (1e4, 1e4) you'll need a better solution.
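
A rough way to repeat the comparison above on your own machine. This is only a sketch: the array size, the repetition count, and the names `a` and `f` (borrowed from the comments on the other answer) are assumptions, and the printed timings will differ between systems:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
a = rng.integers(0, 256, size=(512, 512), dtype=np.uint8)  # example image
f = rng.integers(0, 256, size=256, dtype=np.uint8)         # example lookup table

# Both forms give the same result, with the shape of `a`.
by_index = f[a]
by_take = f.take(a)
same = np.array_equal(by_index, by_take)

# Converting the indices to np.intp once up front, as suggested in the
# comments, may speed up both variants.
a_intp = a.astype(np.intp)

timings = {
    "f[a]": timeit.timeit(lambda: f[a], number=20),
    "f.take(a)": timeit.timeit(lambda: f.take(a), number=20),
    "f[a_intp]": timeit.timeit(lambda: f[a_intp], number=20),
    "f.take(a_intp)": timeit.timeit(lambda: f.take(a_intp), number=20),
}
for label, seconds in timings.items():
    print(f"{label}: {seconds / 20 * 1e3:.3f} ms per call")
```

Which variant wins may depend on the NumPy version and the index dtype, so it is worth measuring on arrays of the size you actually process.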

One place to start is "Performance of various numpy fancy indexing methods, also with numba", which details various performance-related facts for this generic problem (i.e. how do we use one k-dimensional array to index some other n-dimensional array in more than just trivial ways).

If you have something like Anaconda installed, you could try to use Numba to jit np.ndarray.take(...) and see how much performance you can buy. The link above explains this as well.

arra