
I have a 3D numpy.ndarray (think of an image with RGB) like

import numpy as np

a = np.arange(12).reshape(2, 2, 3)
# array([[[ 0,  1,  2], [ 3,  4,  5]],
#        [[ 6,  7,  8], [ 9, 10, 11]]])

and a function that takes a list-like input:

my_sum = lambda x: x[0] + x[1] + x[2]

How can I apply this function to each pixel (that is, to each 1D element of the 2D array)?

What I have tried

np.apply_along_axis

This question is kind of the same as mine, so I tried it first.

np.apply_along_axis(my_sum, 0, a.T).T #EDIT np.apply_along_axis(my_sum, -1, a) is better

At first I thought this was the solution, but it was too slow, because np.apply_along_axis is not built for speed.

np.vectorize

I applied np.vectorize to my_sum.

vector_my_func = np.vectorize(my_sum)

However, I have no idea how this vectorized function should even be called.

vector_my_func(0,1,2) 
#=> TypeError: <lambda>() takes 1 positional argument but 3 were given

vector_my_func(np.arange(3)) 
#=> IndexError: invalid index to scalar variable.

vector_my_func(np.arange(12).reshape(4,3)) 
#=> IndexError: invalid index to scalar variable.

vector_my_func(np.arange(12).reshape(2,2,3)) 
#=> IndexError: invalid index to scalar variable.

I am totally at a loss as to how this should be done.

EDIT

Benchmark results for the suggested methods (run in a Jupyter notebook, restarting the kernel for each test):

a = np.ones((1000,1000,3))
my_sum = lambda x: x[0] + x[1] + x[2]
my_sum_ellipsis = lambda x: x[..., 0] + x[..., 1] + x[..., 2]
vector_my_sum = np.vectorize(my_sum, signature='(i)->()')
%timeit np.apply_along_axis(my_sum, -1, a)
#1 loop, best of 3: 3.72 s per loop

%timeit vector_my_sum(a)
#1 loop, best of 3: 2.78 s per loop

%timeit my_sum(a.transpose(2,0,1))
#100 loops, best of 3: 12 ms per loop

%timeit my_sum_ellipsis(a)
#100 loops, best of 3: 12.2 ms per loop

%timeit my_sum(np.moveaxis(a, -1, 0))
#100 loops, best of 3: 12.2 ms per loop
Allosteric

2 Answers


One option is to transpose the numpy array so that the third axis becomes the first; you can then apply the function directly to it:

my_sum(a.transpose(2,0,1))

#array([[ 3, 12],
#       [21, 30]])

Or rewrite the sum function as:

my_sum = lambda x: x[..., 0] + x[..., 1] + x[..., 2]
my_sum(a)
#array([[ 3, 12],
#       [21, 30]])
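
As a quick sanity check, the two variants above can be compared on the small example array from the question. This is only a minimal sketch; the names `my_sum_ellipsis`, `channels_first`, `result_transpose` and `result_ellipsis` are chosen here for illustration.

```python
import numpy as np

a = np.arange(12).reshape(2, 2, 3)

# Original function: indexes the first axis of whatever it receives.
my_sum = lambda x: x[0] + x[1] + x[2]
# Rewritten function: indexes the last axis and keeps the leading axes.
my_sum_ellipsis = lambda x: x[..., 0] + x[..., 1] + x[..., 2]

# Move the channel axis to the front: shape (2, 2, 3) -> (3, 2, 2).
channels_first = a.transpose(2, 0, 1)

result_transpose = my_sum(channels_first)
result_ellipsis = my_sum_ellipsis(a)

print(channels_first.shape)
print(np.array_equal(result_transpose, result_ellipsis))
```

Both calls operate on whole 2D slices at once, so no Python-level loop over pixels is involved.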
Psidom
  • Thanks for your answers. Both worked very well and are very fast. I added my benchmark results to the question. – Allosteric Mar 28 '17 at 00:10
  • More clearly, you could do `np.moveaxis(a, -1, 0)`, reading as `move axis -1 to position 0` – Eric Mar 28 '17 at 00:14
  • You're welcome, glad it helps. Thanks for sharing the timing results. – Psidom Mar 28 '17 at 00:16
  • Thank you @Eric. I added the benchmark results. Although it has the same speed as transpose, it seems more explicit, which makes it better (in my opinion) – Allosteric Mar 28 '17 at 11:23
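
The `np.moveaxis` suggestion from the comments can be sketched as follows, assuming the same small example array as in the question. The variable names `moved` and `transposed` are illustrative only.

```python
import numpy as np

a = np.arange(12).reshape(2, 2, 3)
my_sum = lambda x: x[0] + x[1] + x[2]

# "Move axis -1 to position 0": shape (2, 2, 3) -> (3, 2, 2).
moved = np.moveaxis(a, -1, 0)
transposed = a.transpose(2, 0, 1)

# For a 3D array both calls produce the same axis order.
print(np.array_equal(moved, transposed))
# moveaxis returns a view of the input, so no data is copied.
print(np.shares_memory(moved, a))

result = my_sum(moved)
print(result)
```

Because only a view is created, it is plausible that the two variants time about the same, which matches the benchmark in the question.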

As of numpy 1.12, vectorize gained a signature argument. So you can use it as:

my_sum = lambda x: x[0] + x[1] + x[2]
vector_my_sum = np.vectorize(my_sum, signature='(i)->()')  # vector to scalar
vector_my_sum(a)

Unfortunately, this is a much slower code path than normal vectorize, which in 1.12 at least runs the for loop in C.

On my machine, with numpy master, this is only about 10% faster than apply_along_axis (although apply_along_axis changed implementation dramatically since 1.12)
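
A minimal sketch of the difference between plain `np.vectorize` and the `signature` form, assuming numpy 1.12 or later and the example array from the question. The names `plain` and `plain_failed` are made up for this illustration.

```python
import numpy as np

a = np.arange(12).reshape(2, 2, 3)
my_sum = lambda x: x[0] + x[1] + x[2]

# Without a signature, my_sum is called on each scalar element,
# so x[0] raises the IndexError seen in the question.
plain = np.vectorize(my_sum)
try:
    plain(a)
    plain_failed = False
except IndexError:
    plain_failed = True

# With '(i)->()', my_sum receives each 1D vector along the last axis
# and returns one scalar per vector.
vector_my_sum = np.vectorize(my_sum, signature='(i)->()')
result = vector_my_sum(a)

print(plain_failed)
print(result)
```

The signature tells vectorize to treat the last axis as the function's core dimension and to loop over the remaining axes, which is why the output has shape (2, 2).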

Eric