Another way to do this is to write the function as though it operates on a single 1-D array, ignoring the 2-D aspect:
import numpy as np

def f(x):
    # Log of each element divided by the geometric mean of the 1-D array x
    return np.log(x / x.prod()**(1.0 / len(x)))
Then if you want to apply it to all rows in a 2-D array (or N-D array):
>>> np.apply_along_axis(f, 1, a)
array([[ 0.30409883, -0.10136628, -0.79451346, 0.5917809 ],
[ 0.07192052, -0.62122666, -0.62122666, 1.17053281],
[-0.95166562, 0.65777229, 1.24555895, -0.95166562],
[ 0.59299864, 0.72653003, -0.65976433, -0.65976433],
[-0.07391256, -0.58473818, 0.26255968, 0.39609107]])
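As a quick sanity check on the approach above, here is a minimal self-contained sketch. The sample array is made up for illustration (it is not the array from the question), and the fully vectorized variant using axis=1 and keepdims=True is my own addition for comparison, not part of the function above. Since each row is divided by its own geometric mean, the logs in every row should sum to roughly zero.

```python
import numpy as np

def f(x):
    # Log of each element divided by the geometric mean of the 1-D array x
    return np.log(x / x.prod() ** (1.0 / len(x)))

# Hypothetical sample data; any array of positive values works here.
a = np.array([[1.0, 2.0, 4.0, 8.0],
              [3.0, 3.0, 9.0, 1.0]])

# Apply the 1-D function to every row (1-D slices along axis 1).
row_wise = np.apply_along_axis(f, 1, a)

# Assumed alternative: the same computation without any per-row function call,
# using a per-row product that broadcasts against the original array.
geo_means = a.prod(axis=1, keepdims=True) ** (1.0 / a.shape[1])
vectorized = np.log(a / geo_means)

# Logs relative to the geometric mean sum to (approximately) zero per row.
row_sums = row_wise.sum(axis=1)
```

Both forms should give the same result. For large arrays the second one is likely to be faster, because apply_along_axis still calls f once per row from Python.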
Some other general notes on your attempt:
for i in range(len(a)): If you want to loop over all rows in an array, it's generally faster to iterate directly with for row in a. NumPy can optimize this case somewhat, whereas with for idx in range(len(a)) you have to index the array again with a[idx] on every iteration, which is slower. But even then it's better not to use a for loop at all where possible, which you already know.
row = np.array(a[i]): The np.array() isn't necessary. If you index a multi-dimensional array, the returned value is already an array.
lambda x: math.log(x/geo_mean): Don't use math functions with NumPy arrays. Use the equivalents in the numpy module. Wrapping this in a function adds unnecessary overhead as well. Since you use this like [flr(x) for x in row], that's just equivalent to the already vectorized NumPy operation: np.log(row / geo_mean).
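To make the three notes above concrete, here is a small sketch. The array is hypothetical sample data rather than the one from the question, and the names rows, slow and fast are mine, used only for illustration.

```python
import math
import numpy as np

# Hypothetical sample data with positive values.
a = np.array([[1.0, 2.0, 4.0, 8.0],
              [3.0, 3.0, 9.0, 1.0]])

# Iterating over a 2-D array yields its rows directly; no index is needed.
rows = [row for row in a]

# Indexing a 2-D array already returns an ndarray, so np.array(a[i]) is redundant.
first_row = a[0]
is_array = isinstance(first_row, np.ndarray)

geo_mean = first_row.prod() ** (1.0 / len(first_row))

# Element by element with math.log, as in the original attempt ...
slow = np.array([math.log(x / geo_mean) for x in first_row])

# ... versus a single vectorized call on the whole row.
fast = np.log(first_row / geo_mean)
```

The two results should match; the vectorized version simply does the loop inside NumPy rather than in Python.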