1

I have a ndarray of size (M, N, 3). It holds M x N points, each of which is identified by its x, y, and z coordinates. I also have a function 'foo'. 'foo' takes in a point as a ndarray of size (, 3). Is there a faster way to call 'foo' on each of the M x N points in my array than to use two nested for loops?

So far, this is what I have tried. My array is the variable 'sample_array'.

num_rows = sample_array.shape[0]
num_columns = sample_array.shape[1]
solution = np.zeros((num_rows, num_columns))

for row in range(num_rows):
   for column in range(num_columns):
      point = sample_array[row, column]
      solution[row, column] = foo(point)

I found this answer, which describes using these two solutions:

np.vectorize(foo)
np.array(map(foo, sample_array))

However, I am not sure how to specify that I don't want the function mapped to every one of the M x N x 3 floats. Instead, I would like it to map the function to be called on each of the M x N (, 3) ndarrays.

Thank you!

wingedNorthropi
  • 149
  • 2
  • 16
  • 1
    For general function `foo`, the simple answer is **No**. Both `map` and `np.vectorize` are just fancy ways to write a for loop, just like you did. – Quang Hoang Jun 02 '20 at 20:34
  • @QuangHoang Thanks for the answer! Is there a way to use map, np.vectorize, or some other function to at least reduce the number of lines of code required? – wingedNorthropi Jun 02 '20 at 20:36
  • As long the function only accepts 1 point, there isn't much you can do about speed. It has to be called `N*M` times; that's the main time consumer. – hpaulj Jun 02 '20 at 21:01

4 Answers4

1

you can try np.apply_along_axis(function, axis=2, arr=your_input_array) which will slice the array along the 3rd axis and apply the function to each slice, i.e. point.

Find the doc here np.apply_along_axis

CtrlMj
  • 119
  • 7
0

A way to use map is:

solution = np.array(list(map(foo, sample_array.reshape(-1,3)) ))\
             .reshape(sample_array.shape[:2])
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
0

Ideally, the function foo you're calling will accept arrays as input. In general, it's good style for any function to accept all arrays as long as the first dimension is as expected. For example, the function

import numpy


def length(x):
    return numpy.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)

computes the length of a vector, but also accepts arrays of shape (3, ...) (it doesn't even matter how many dimensions follow) and does the right thing.

This design has the following advantages:

Nico Schlömer
  • 53,797
  • 27
  • 201
  • 249
  • Thanks for the answer. So this would require that the input be of shape (3, M, N), correct? Is there a way to achieve this, using your method, that can in the shape (M, N, 3)? Would it be bad practice to store my points in this way? – wingedNorthropi Jun 02 '20 at 20:47
  • It's very common to store arrays as `(n, 3)`, mostly attributed to the fact that `print()`ing looks nicer. It's actually computationally more efficient to keep the leading dimension small. You can of course always use `moveaxis`, but this actually moves around memory and takes time. I'd organize my code such that the shape of the array is `(3, ...)` right from the start. – Nico Schlömer Jun 02 '20 at 20:50
0

In fewer lines of code:

solution = np.vectorize(foo)(*sample_array.reshape(-1, 3).T)
panadestein
  • 1,241
  • 10
  • 21