99

I have a 2D matrix and I want to take the norm of each row. But when I use numpy.linalg.norm(X) directly, it takes the norm of the whole matrix.

I can take the norm of each row by using a for loop and then taking the norm of each X[i], but that takes a very long time since I have 30k rows.

Any suggestions to find a quicker way? Or is it possible to apply np.linalg.norm to each row of a matrix?

nbro

5 Answers

101

For numpy 1.9+

Note that, as perimosocordiae shows, as of NumPy version 1.9, np.linalg.norm(x, axis=1) is the fastest way to compute the L2-norm.
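To make the difference from the question concrete, here is a minimal sketch. The array `x` is made-up example data, not taken from the question, and it assumes a NumPy version whose `norm` accepts `axis`.

```python
import numpy as np

# Hypothetical example data: 3 rows, 2 columns.
x = np.array([[3.0, 4.0],
              [6.0, 8.0],
              [0.0, 1.0]])

# Without axis, norm() treats the whole matrix as one object
# and returns a single number (the Frobenius norm).
whole = np.linalg.norm(x)

# With axis=1, each row is treated as a vector and gets its own L2 norm.
row_norms = np.linalg.norm(x, axis=1)

print(whole)      # a single scalar
print(row_norms)  # one value per row, shape (3,)
```

The result has one entry per row, so for the 30k-row matrix in the question it would have shape (30000,).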

For numpy < 1.9

If you are computing an L2-norm, you could compute it directly (using the axis=-1 argument to sum along rows):

np.sum(np.abs(x)**2,axis=-1)**(1./2)

Lp-norms can be computed similarly of course.
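As a rough sketch of what "similarly" could look like, the helper below generalizes the same expression to any finite p >= 1. The name `row_lp_norm` and the sample data are hypothetical, chosen only for illustration.

```python
import numpy as np


def row_lp_norm(x, p):
    """Hypothetical helper: Lp norm of each row, for finite p >= 1."""
    return np.sum(np.abs(x) ** p, axis=-1) ** (1.0 / p)


# Made-up example data.
x = np.array([[3.0, -4.0],
              [1.0, 2.0]])

l1 = row_lp_norm(x, 1)  # sum of absolute values per row
l2 = row_lp_norm(x, 2)  # same as the expression above
l3 = row_lp_norm(x, 3)

# The max-norm (p = infinity) is a limit case and needs its own expression.
linf = np.max(np.abs(x), axis=-1)
```

Keeping `np.abs` in the expression matters for odd p with negative entries as well as for complex input.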

The direct computation is considerably faster than np.apply_along_axis, though perhaps not as convenient:

In [48]: %timeit np.apply_along_axis(np.linalg.norm, 1, x)
1000 loops, best of 3: 208 us per loop

In [49]: %timeit np.sum(np.abs(x)**2,axis=-1)**(1./2)
100000 loops, best of 3: 18.3 us per loop

Other ord forms of norm can be computed directly too (with similar speedups):

In [55]: %timeit np.apply_along_axis(lambda row:np.linalg.norm(row,ord=1), 1, x)
1000 loops, best of 3: 203 us per loop

In [54]: %timeit np.sum(abs(x), axis=-1)
100000 loops, best of 3: 10.9 us per loop
ddejohn
unutbu
  • 10
    Why do you do np.abs(x) if you square x anyway? – Patrick Jan 03 '13 at 15:17
  • 12
    @Patrick: If the dtype of `x` is complex, then it makes a difference. For example, if `x = np.array([(1+1j,2+1j)])` then `np.sum(np.abs(x)**2,axis=-1)**(1./2)` is `array([ 2.64575131])`, while `np.sum(x**2,axis=-1)**(1./2)` is `array([ 2.20320266+1.36165413j])`. – unutbu Jan 03 '13 at 20:15
  • 4
    @perimosocordiae [posted](http://stackoverflow.com/a/19794741/1959808) an update that [`numpy.linalg.norm`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html) with its [new](https://github.com/numpy/numpy/pull/3387) `axis` argument is currently the fastest approach. – 0 _ Nov 18 '13 at 08:32
  • How do I do the same if I want to apply the norm column-wise to a matrix? – Gunjan naik Jul 23 '15 at 09:42
  • @user3515225: `np.linalg.norm(x, axis=0)`. The `axis` refers to the axis being summed over. For a 2D array, the 0-axis refers to rows, so `axis=0` causes `norm` to sum down the rows for each fixed column. – unutbu Jul 23 '15 at 09:56
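The two points raised in the comments above can be checked with a short sketch. The complex array `z` is the one from unutbu's comment; the real array `m` is made-up data for the column-wise case.

```python
import numpy as np

# Complex input: np.abs is what makes the result a real, non-negative norm.
z = np.array([[1 + 1j, 2 + 1j]])
with_abs = np.sum(np.abs(z) ** 2, axis=-1) ** 0.5  # real-valued norm
without_abs = np.sum(z ** 2, axis=-1) ** 0.5       # complex value, not a norm

# Column-wise versus row-wise norms on a hypothetical 2x3 real array.
m = np.array([[3.0, 0.0, 1.0],
              [4.0, 2.0, 1.0]])
col_norms = np.linalg.norm(m, axis=0)  # one value per column, shape (3,)
row_norms = np.linalg.norm(m, axis=1)  # one value per row, shape (2,)

print(with_abs, without_abs)
print(col_norms, row_norms)
```

The axis passed to `norm` is the one that disappears from the result, which is why `axis=0` yields one norm per column.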
60

Resurrecting an old question due to a numpy update. As of the 1.9 release, numpy.linalg.norm now accepts an axis argument. [code, documentation]

This is the new fastest method in town:

In [10]: x = np.random.random((500,500))

In [11]: %timeit np.apply_along_axis(np.linalg.norm, 1, x)
10 loops, best of 3: 21 ms per loop

In [12]: %timeit np.sum(np.abs(x)**2,axis=-1)**(1./2)
100 loops, best of 3: 2.6 ms per loop

In [13]: %timeit np.linalg.norm(x, axis=1)
1000 loops, best of 3: 1.4 ms per loop

And to prove it's calculating the same thing:

In [14]: np.allclose(np.linalg.norm(x, axis=1), np.sum(np.abs(x)**2,axis=-1)**(1./2))
Out[14]: True
perimosocordiae
27

Much faster than the accepted answer is using NumPy's einsum,

numpy.sqrt(numpy.einsum('ij,ij->i', a, a))

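And even faster than that is arranging the data so that each vector is a column rather than a row (aT below is the transpose of a, stored contiguously), so that the norms are computed for each column,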

numpy.sqrt(numpy.einsum('ij,ij->j', aT, aT))
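A small sanity check that both einsum forms agree with `np.linalg.norm` may be useful before switching. This is only a sketch on random example data, and it assumes real-valued input: for complex arrays the plain product `a * a` is not the squared magnitude, so the expression would need a conjugate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Example data: 1000 vectors of length 3, stored as rows.
a = rng.random((1000, 3))
# The same vectors stored as columns, in a contiguous array.
aT = np.ascontiguousarray(a.T)

# 'ij,ij->i' multiplies elementwise and sums over j: squared norm of each row.
by_rows = np.sqrt(np.einsum("ij,ij->i", a, a))
# 'ij,ij->j' sums over i instead: squared norm of each column of aT.
by_cols = np.sqrt(np.einsum("ij,ij->j", aT, aT))

reference = np.linalg.norm(a, axis=1)
print(np.allclose(by_rows, reference), np.allclose(by_cols, reference))
```

Whether the transposed layout pays off depends on whether the data can be kept in that layout; transposing and copying just for the norm adds its own cost.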

Note the log-scale:

[Figure: log-scale runtime plot comparing the methods against len(a), produced by the code below]


Code to reproduce the plot:

import numpy as np
import perfplot

rng = np.random.default_rng(0)


def setup(n):
    x = rng.random((n, 3))
    xt = np.ascontiguousarray(x.T)
    return x, xt


def sum_sqrt(a, _):
    return np.sqrt(np.sum(np.abs(a) ** 2, axis=-1))


def apply_norm_along_axis(a, _):
    return np.apply_along_axis(np.linalg.norm, 1, a)


def norm_axis(a, _):
    return np.linalg.norm(a, axis=1)


def einsum_sqrt(a, _):
    return np.sqrt(np.einsum("ij,ij->i", a, a))


def einsum_sqrt_columns(_, aT):
    return np.sqrt(np.einsum("ij,ij->j", aT, aT))


b = perfplot.bench(
    setup=setup,
    kernels=[
        sum_sqrt,
        apply_norm_along_axis,
        norm_axis,
        einsum_sqrt,
        einsum_sqrt_columns,
    ],
    n_range=[2**k for k in range(20)],
    xlabel="len(a)",
)
b.show()
b.save("out.png")
Nico Schlömer
7

Try the following:

In [16]: numpy.apply_along_axis(numpy.linalg.norm, 1, a)
Out[16]: array([ 5.38516481,  1.41421356,  5.38516481])

where a is your 2D array.

The above computes the L2 norm. For a different norm, you could use something like:

In [22]: numpy.apply_along_axis(lambda row:numpy.linalg.norm(row,ord=1), 1, a)
Out[22]: array([9, 2, 9])
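On a NumPy version whose `norm` accepts `axis`, the same results can be obtained without `apply_along_axis`. The sketch below is only illustrative: the array `a` is an assumption, chosen because it is consistent with the outputs shown above; the original `a` is not given in the answer.

```python
import numpy as np

# Assumed input (not from the answer), consistent with the outputs shown above.
a = np.array([[2, 3, 4],
              [1, 1, 0],
              [4, 3, 2]])

# L2 norm of each row, via the slower per-row call and the vectorized call.
l2_loop = np.apply_along_axis(np.linalg.norm, 1, a)
l2_vec = np.linalg.norm(a, axis=1)

# With an integer axis, ord selects a vector norm; ord=1 sums absolute values.
l1_vec = np.linalg.norm(a, ord=1, axis=1)

print(l2_vec)
print(l1_vec)
```

Note that the vectorized L1 result is a float array, whereas the lambda version above returned integers for integer input.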
NPE
0

To go one step further and normalize each row to unit length, it is as simple as this, when your 2D numpy array is x:

x_unit = x / np.linalg.norm(x, axis=1, keepdims=True)
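One caveat with the line above is that a row of zeros has norm 0, so the division produces NaN for that row. The following is a minimal sketch of one way to guard against that, using made-up data; leaving zero rows as zeros is a design choice assumed here, not something the answer specifies.

```python
import numpy as np

# Hypothetical data with an all-zero row.
x = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])

# keepdims=True keeps the result as shape (3, 1) so it broadcasts against x.
norms = np.linalg.norm(x, axis=1, keepdims=True)

# Replace zero norms by 1 so that zero rows stay zero instead of becoming NaN.
safe_norms = np.where(norms == 0, 1.0, norms)
x_unit = x / safe_norms

print(x_unit)
```

Without `keepdims=True` the norms would have shape (3,), and the division would fail to broadcast against a (3, 2) array.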
Peyman