115

Given a 3×3 NumPy array:

a = numpy.arange(0,27,3).reshape(3,3)

# array([[ 0,  3,  6],
#        [ 9, 12, 15],
#        [18, 21, 24]])

To normalize the rows of this 2-dimensional array, I thought of:

row_sums = a.sum(axis=1) # array([ 9, 36, 63])
new_matrix = numpy.zeros((3,3))
for i, (row, row_sum) in enumerate(zip(a, row_sums)):
    new_matrix[i,:] = row / row_sum

There must be a better way, mustn't there?

To clarify: by normalizing I mean that the entries in each row must sum to one. But I think that will be clear to most people.

Aufwind
  • 19
    Careful, "normalize" usually means the *square* sum of components is one. Your definition will hardly be clear to most people;) – coldfix Jul 13 '15 at 18:10
  • 5
    @coldfix speaks about `L2` norm and considers it as most common (which may be true) while Aufwind uses `L1` norm which is also a norm indeed. – Bálint Sass Feb 12 '21 at 09:50

12 Answers

173

Broadcasting is really good for this:

row_sums = a.sum(axis=1)
new_matrix = a / row_sums[:, numpy.newaxis]

row_sums[:, numpy.newaxis] reshapes row_sums from shape (3,) to shape (3, 1). When you compute a / b, a and b are broadcast against each other, so here each row of a is divided by the matching entry of row_sums.
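As a minimal sketch of why the extra axis matters, the snippet below compares the shapes involved. The variable names row_sums_col, row_sums_kept and wrong are only illustrative, and keepdims=True is shown as an alternative way to get the same (3, 1) shape.

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)

row_sums = a.sum(axis=1)                      # shape (3,)
row_sums_col = row_sums[:, np.newaxis]        # shape (3, 1)
row_sums_kept = a.sum(axis=1, keepdims=True)  # shape (3, 1) without reshaping

# (3, 3) / (3, 1): each row is divided by its own sum.
new_matrix = a / row_sums_col

# (3, 3) / (3,): column j is divided by row_sums[j], which is NOT row normalization.
wrong = a / row_sums

print(new_matrix.sum(axis=1))
print(wrong.sum(axis=1))
```

Only the first division yields rows that sum to one; the second runs without error, which makes the mistake easy to miss.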

You can learn more about broadcasting here or even better here.

Daniel Fischer
Bi Rico
  • 41
    This can be simplified even further using `a.sum(axis=1, keepdims=True)` to keep the singleton column dimension, which you can then broadcast along without having to use `np.newaxis`. – ali_m Apr 23 '15 at 13:26
  • 9
    what if any of the row_sums is zero? – asdf Apr 24 '15 at 23:31
  • @asdf ...well in that case normalizing by the row sum doesn't really make much sense! – ali_m Apr 25 '15 at 19:25
  • 12
    This is the correct answer for the question as stated above - but if a normalization in the usual sense is desired, use `np.linalg.norm` instead of `a.sum`! – coldfix Jul 13 '15 at 18:12
  • 2
    is this preferred to `row_sums.reshape(3,1)` ? – Paul Aug 10 '15 at 02:09
  • 1
    It's not as robust since the row sum may be 0. – nos Jun 08 '16 at 22:48
  • If a vector is normalized, it should have a unit norm, using a / row_sums[:, numpy.newaxis] really doesn't guarantee a unit norm. – XY.W Jan 12 '17 at 09:37
  • @XY.W There are many definitions of "unit norm", take a look at the ord argument to [numpy's norm function](https://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.linalg.norm.html). Ord 1 norms are often useful and the OP asked specifically about normalizing with respect to this norm, but you can of course replace the denominator with the most appropriate norm for your application. – Bi Rico Jan 13 '17 at 20:29
  • Is this the same as MinMaxNorm or what is the name of this normalization? – Mona Jalal Sep 23 '17 at 00:15
  • This is equivalent to `new_matrix = a / row_sums[:, None]`, as `None` can be used as a shorthand for `np.newaxis`. – johannesack May 07 '21 at 12:02
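On the zero row sum case raised in the comments: one possible way to avoid the division warning and NaN entries is to divide only where the sum is nonzero. This is a sketch, not part of the answer above. Leaving all-zero rows as zeros is an assumption about what you want, and the example array is made up for illustration.

```python
import numpy as np

a = np.array([[0.0, 3.0, 6.0],
              [0.0, 0.0, 0.0],    # this row sums to zero
              [18.0, 21.0, 24.0]])

row_sums = a.sum(axis=1, keepdims=True)  # shape (3, 1)

# Divide only where the row sum is nonzero; the other entries keep the
# value already in `out`, i.e. zero.
safe = np.divide(a, row_sums, out=np.zeros_like(a), where=row_sums != 0)

print(safe)
```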
133

scikit-learn offers a function, normalize(), that lets you apply various normalizations. Making each row sum to 1 corresponds to the L1 norm (for non-negative entries, as here). Therefore:

import numpy
from sklearn.preprocessing import normalize

matrix = numpy.arange(0,27,3).reshape(3,3).astype(numpy.float64)
# array([[  0.,   3.,   6.],
#        [  9.,  12.,  15.],
#        [ 18.,  21.,  24.]])

normed_matrix = normalize(matrix, axis=1, norm='l1')
# [[ 0.          0.33333333  0.66666667]
#  [ 0.25        0.33333333  0.41666667]
#  [ 0.28571429  0.33333333  0.38095238]]

Now your rows will sum to 1.
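One caveat, sketched in plain NumPy rather than scikit-learn: the L1 norm is the sum of absolute values, so dividing by it and dividing by the plain row sum only agree when all entries are non-negative. The matrix m and the names by_sum and by_l1 below are illustrative assumptions.

```python
import numpy as np

m = np.array([[1.0, -3.0],
              [2.0, 2.0]])

# Divide by the plain row sum: the entries of each row sum to 1.
by_sum = m / m.sum(axis=1, keepdims=True)

# Divide by the L1 norm: the absolute values of each row sum to 1.
by_l1 = m / np.abs(m).sum(axis=1, keepdims=True)

print(by_sum)
print(by_l1)
```

For the second row, which has no negative entries, both results are identical.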

normanius
rogueleaderr
  • 3
    This also has the advantage that it works on sparse arrays that would not fit into memory as dense arrays. – JEM_Mosig Jan 29 '20 at 19:58
11

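I think this should work. Note that the array must have a float dtype (hence the 27.) for the in-place division to be allowed:

a = numpy.arange(0,27.,3).reshape(3,3)

a /= a.sum(axis=1)[:,numpy.newaxis]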
tom10
6

In case you are trying to normalize each row so that its magnitude is one (i.e. each row has unit length, meaning the sum of the squares of its elements is one):

import numpy as np

a = np.arange(0,27,3).reshape(3,3)

result = a / np.linalg.norm(a, axis=-1)[:, np.newaxis]
# array([[ 0.        ,  0.4472136 ,  0.89442719],
#        [ 0.42426407,  0.56568542,  0.70710678],
#        [ 0.49153915,  0.57346234,  0.65538554]])

Verifying:

np.sum( result**2, axis=-1 )
# array([ 1.,  1.,  1.]) 
walt
4

I think you can normalize the rows so that their elements sum to 1 with new_matrix = a / a.sum(axis=1, keepdims=True). Column normalization can be done with new_matrix = a / a.sum(axis=0, keepdims=True). Hope this helps.
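A quick check of both variants on the array from the question, as a sketch (the names row_normalized and col_normalized are illustrative):

```python
import numpy as np

a = np.arange(0, 27, 3).reshape(3, 3)

# keepdims=True keeps the summed axis as length 1, so broadcasting lines up.
row_normalized = a / a.sum(axis=1, keepdims=True)  # divisor has shape (3, 1)
col_normalized = a / a.sum(axis=0, keepdims=True)  # divisor has shape (1, 3)

print(row_normalized.sum(axis=1))  # each row sums to 1
print(col_normalized.sum(axis=0))  # each column sums to 1
```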

Snoopy
2

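You could use the built-in NumPy function np.linalg.norm to compute the per-row norms and divide by them: a / np.linalg.norm(a, axis = 1, keepdims = True). Note that this uses the L2 norm by default; pass ord = 1 to divide by the sum of absolute values instead.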

1

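This also works, provided the row sums keep their column dimension (with a plain 1-D row_sums, the division would broadcast across columns instead of rows):

def normalizeRows(M):
    # keepdims=True gives shape (n, 1), so each row is divided by its own sum
    row_sums = M.sum(axis=1, keepdims=True)
    return M / row_sums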
Jamesszm
0

You could also use matrix transposition:

row_sums = a.sum(axis=1)
(a.T / row_sums).T
Maciek
0

Here is one more possible way using reshape:

a_norm = (a/a.sum(axis=1).reshape(-1,1)).round(3)
print(a_norm)

Or using None works too:

a_norm = (a/a.sum(axis=1)[:,None]).round(3)
print(a_norm)

Output:

array([[0.   , 0.333, 0.667],
       [0.25 , 0.333, 0.417],
       [0.286, 0.333, 0.381]])
Grayrigel
0

Use

a = a / np.linalg.norm(a, ord = 2, axis = 1, keepdims = True)

With keepdims = True the norms have shape (3, 1), so broadcasting divides each row by its own L2 norm (use axis = 0 to normalize columns instead).

Moj
-1

Or use a lambda function, like

>>> import numpy as np
>>> vec = np.arange(0,27,3).reshape(3,3)
>>> norm_vec = np.array(list(map(lambda row: row/np.linalg.norm(row), vec)))

Each row of norm_vec will then have a unit (L2) norm.

XY.W
-1

We can achieve the same effect by premultiplying with the diagonal matrix whose main diagonal holds the reciprocals of the row sums.

A = np.diag(1.0 / A.sum(axis=1)) @ A
kimegitee
  • too inefficient. you turned a simple sum over all elements into a big (sparse) matrix multiplication – qwr Mar 30 '22 at 19:22
  • @qwr The original poster did not ask for a more efficient version, only a less "verbose" one. – kimegitee Dec 05 '22 at 19:02
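To make the trade-off discussed in these comments concrete, here is a small sketch comparing the diagonal-matrix form with plain broadcasting. The names via_diag and via_broadcast are illustrative. Both give the same result, but the first builds an n×n matrix and performs a full matrix product, while the second only needs an (n, 1) column.

```python
import numpy as np

A = np.arange(0, 27, 3).reshape(3, 3)
row_sums = A.sum(axis=1)

# Builds an n x n diagonal matrix, then does a full matrix product.
via_diag = np.diag(1.0 / row_sums) @ A

# Only needs an (n, 1) column of row sums; the division is elementwise.
via_broadcast = A / row_sums[:, np.newaxis]

print(np.allclose(via_diag, via_broadcast))
```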