I want to iterate over a matrix and apply a function to each row. How can I do this for a NumPy matrix?
68
-
2 It is likely that you will get more helpful answers if you explain what you are trying to achieve / what kind of function to apply. Also, you may want to have a look at: http://stackoverflow.com/questions/8079061/function-application-over-numpys-matrix-row-column – root May 09 '13 at 18:42
-
2 Please post your code. If you haven't tried anything yet, try a few things and post the problems you run into. – Ryan Saxe May 09 '13 at 18:49
3 Answers
88
You can use numpy.apply_along_axis(). Assuming that your array is 2D, you can use it like this:
import numpy as np

myarray = np.array([[11, 12, 13],
                    [21, 22, 23],
                    [31, 32, 33]])

def myfunction(x):
    return x[0] + x[1]**2 + x[2]**3

print(np.apply_along_axis(myfunction, axis=1, arr=myarray))
# [ 2352 12672 36992]

Saullo G. P. Castro
-
10 If you are using `numpy` functions you can (usually) just specify the axis, like: `mymatrix.sum(axis=1)`. – root May 09 '13 at 19:03
-
1 That's right, the sum() in myfunction was just an example, but for some cases, like [here](http://stackoverflow.com/questions/15094619/fitting-a-3d-array-of-data-to-a-1d-function-with-numpy-or-scipy/16315330#16315330), `np.apply_along_axis()` can be very useful – Saullo G. P. Castro May 09 '13 at 19:06
-
32 The problem is that `apply_along_axis` is a Python for loop in disguise. It can give the illusion of numpy performance, but it will not deliver it. In the question you link, using `apply_along_axis` has no benefit over using a for loop. Trying to vectorize whatever function you want to apply to every row is the numpythonic way of doing things. – Jaime May 09 '13 at 19:34
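As a rough illustration of the comments above on axis arguments and vectorization, the example function from this answer can be written with whole-column arithmetic instead of one call per row. This is only a sketch for this particular function; the names `looped`, `vectorized` and `row_sums` are made up here.

```python
import numpy as np

myarray = np.array([[11, 12, 13],
                    [21, 22, 23],
                    [31, 32, 33]])

def myfunction(x):
    return x[0] + x[1]**2 + x[2]**3

# Per-row call: apply_along_axis invokes myfunction once for each row.
looped = np.apply_along_axis(myfunction, axis=1, arr=myarray)

# Vectorized equivalent: operate on whole columns at once.
vectorized = myarray[:, 0] + myarray[:, 1]**2 + myarray[:, 2]**3

# Built-in reductions accept an axis argument directly.
row_sums = myarray.sum(axis=1)

print(looped)
print(vectorized)
print(row_sums)
```

Both `looped` and `vectorized` hold the same values; the second form avoids the per-row Python call that Jaime's comment warns about.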
72
While you should certainly provide more information, if you are trying to go through each row, you can just iterate with a for loop:
import numpy

m = numpy.ones((3, 5), dtype='int')
for row in m:
    print(row)  # each row is a 1-D array
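If the goal is to apply a function to each row and keep the results, the same loop can be written as a list comprehension. A minimal sketch; `row_range` is a hypothetical function chosen only for illustration.

```python
import numpy as np

m = np.arange(15).reshape(3, 5)

def row_range(row):
    # Hypothetical per-row function: difference between max and min.
    return row.max() - row.min()

# Iterating over a 2-D array yields one row (a 1-D array) per step.
results = np.array([row_range(row) for row in m])
print(results)
```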

Noel Evans

matthew-parlette
-
2 @Brendan It's pretty late, but looping over a NumPy array is usually expensive because the Python interpreter and the NumPy code have to exchange data on every iteration. – sohnryang Jun 07 '20 at 06:00
-
2 @sohnryang, thanks for the response. In the past I've iterated over numpy arrays via indices (e.g. `for i in range(m)`), and that hasn't been a performance bottleneck in my experience up to 100k iterations or so. This thread seems to indicate that the assignment of each row to the `row` variable may be the slow part, so index-based iteration may be the way to go here rather than variable assignment: https://stackoverflow.com/questions/39371021/efficient-loop-over-numpy-array – Brendan Jun 08 '20 at 17:56
-
I think it will produce a wrong answer in case there is a vector as an input. – Royi Aug 24 '22 at 18:38
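One way to read the last comment: iterating over a 1-D array yields scalars rather than rows. A possible guard, sketched here on the assumption that a 1-D input should be treated as a single row, is `np.atleast_2d`.

```python
import numpy as np

v = np.array([1, 2, 3])

# Iterating a 1-D array yields its elements, not rows.
items = [np.ndim(item) for item in v]  # every element is zero-dimensional

# np.atleast_2d promotes the vector to shape (1, 3), so the loop sees one row.
rows = [row for row in np.atleast_2d(v)]

print(items)
print(len(rows), rows[0])
```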
8
Here's my take if you want to use multiple processes to handle each row of a NumPy array:
from multiprocessing import Pool
import numpy as np

def my_function(x):
    pass  # do something and return something

if __name__ == '__main__':
    X = np.arange(6).reshape((3, 2))
    pool = Pool(processes=4)
    results = pool.map(my_function, map(lambda x: x, X))
    pool.close()
    pool.join()
pool.map takes a function and an iterable.
I used the built-in 'map' function to create an iterator over the rows of the array.
There may be a better way to create the iterable, though.
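On the question of a better iterable: iterating over a 2-D array already yields its rows, so the array itself should be usable wherever the `map(lambda x: x, X)` wrapper is. A small check of that claim, written without starting any worker processes; the names `wrapped`, `direct` and `same` are made up for this sketch.

```python
import numpy as np

X = np.arange(6).reshape((3, 2))

wrapped = list(map(lambda x: x, X))  # the iterable used in the answer
direct = list(X)                     # iterating the array directly

# Compare the two sequences row by row.
same = all(np.array_equal(a, b) for a, b in zip(wrapped, direct))
print(len(wrapped), len(direct), same)
# If the rows match, pool.map(my_function, X) would receive the same rows.
```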

hamster ham