I have an array, X, in which every element is itself a 2D vector. The diagonal of this array contains only zero vectors. I need to normalize every vector in the array without changing its structure.

First I tried to calculate the norm of every vector and store the results in an array, N. Then I wanted to divide every vector in X by the corresponding entry of N. I ran into two problems:

1) Many entries of N are zero, which is obviously a problem when I try to divide by them.

2) The shapes of the arrays don't match, so np.divide() doesn't work as expected.

Beyond that, I don't think it's a good idea to calculate N like this, because later on I want to be able to do the same with more than two vectors.

import numpy as np

# Example array
X = np.array([[[0, 0], [1, -1]], [[-1, 1], [0, 0]]])
# Array containing the norms
N = np.vstack((np.linalg.norm(X[0], axis=1), np.linalg.norm(X[1], axis=1)))
R = np.divide(X, N)

I want the output to look like this:

R = np.array([[[0, 0], [0.70710678, -0.70710678]], [[-0.70710678, 0.70710678], [0, 0]]])
jerremaier
  • What is D, above? Can you provide a definition so I can try and run your code? – Hayden Eastwood Jun 26 '19 at 17:15
  • Similar to https://stackoverflow.com/questions/21030391/how-to-normalize-an-array-in-numpy – Buckeye14Guy Jun 26 '19 at 17:21
  • @jerremaier see my answer and let me know if it helps – seralouk Jun 26 '19 at 17:24
  • If dividing by the norm is all you wish to do but have cases where the norm is 0 you could use `f = lambda Y: Y / np.linalg.norm(Y) if np.linalg.norm(Y) else 0` and use the list comprehension from @serafeim. [Norm](https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.norm.html) shouldn't be 0 unless the whole array is filled with 0s [or significantly small numbers :) ] – Buckeye14Guy Jun 26 '19 at 17:26
  • @HaydenEastwood D is supposed to be X, sorry. I fixed my mistake – jerremaier Jun 27 '19 at 10:07
  • @Buckeye14Guy Thank you very much! This is exactly what I was looking for and I really like the short solution. Is there a particular reason you use `lambda` instead of a function definition? – jerremaier Jun 27 '19 at 14:00
  • @jerremaier NP. You can always use the `def f(Y): ...` pattern but I figured the whole function was a one liner. – Buckeye14Guy Jun 27 '19 at 14:25
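
The approach from the comments above (a small helper applied through a list comprehension) could be sketched as follows. This is only an illustration, not the commenter's exact code: the name `normalize_vector` is hypothetical, and the helper returns a zero array rather than the scalar `0` for zero vectors, so that every entry has the same shape and the results stack into a regular array.

```python
import numpy as np

def normalize_vector(v):
    """Return v scaled to unit length, or a zero vector if its norm is 0."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    # Return an array of the same shape in both branches so the results stack cleanly.
    return v / norm if norm else np.zeros_like(v)

# Example array from the question
X = np.array([[[0, 0], [1, -1]], [[-1, 1], [0, 0]]])

# Apply the helper to every innermost 2D vector, keeping the outer structure.
R = np.array([[normalize_vector(v) for v in row] for row in X])
print(R)
```

Because the comprehension only iterates over the two outer axes, the same pattern should carry over to a larger grid of vectors, at the cost of a Python-level loop.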

1 Answer

You do not need to use sklearn. Just define a function and then use a list comprehension.

Assuming that the 0th dimension of X equals the number of 2D arrays that you have, use this:

import numpy as np

# Example array
X = np.array([[[0, 0], [1, -1]], [[-1, 1], [0, 0]]])

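def stdmtx(X):
    # Subtract each row's mean, then divide by its sample standard deviation (ddof=1).
    X = X - X.mean(axis=1)[:, np.newaxis]
    X = X / X.std(axis=1, ddof=1)[:, np.newaxis]
    # Zero rows produce 0/0 = NaN; nan_to_num maps those back to 0.
    return np.nan_to_num(X)

# Apply the function to each 2D sub-array along axis 0.
R = np.array([stdmtx(X[i, :, :]) for i in range(X.shape[0])])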

For the example array, this produces the desired output R:

array([[[ 0.        ,  0.        ],
        [ 0.70710678, -0.70710678]],

       [[-0.70710678,  0.70710678],
        [ 0.        ,  0.        ]]])
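
One caveat: `stdmtx` standardizes each row (subtract the mean, divide by the sample standard deviation). For the two-component example vectors, whose components sum to zero, this happens to equal division by the Euclidean norm. In general it does not: `[3, 4]` would become `[-0.707, 0.707]` rather than `[0.6, 0.8]`. If division by the norm itself is what is wanted, a minimal vectorized sketch could look like the one below. It assumes the vectors lie along the last axis, uses `keepdims=True` so the norms broadcast against X, and uses the `where` argument of `np.divide` to skip zero norms.

```python
import numpy as np

# Example array from the question, as floats so the division result fits in `out`.
X = np.array([[[0, 0], [1, -1]], [[-1, 1], [0, 0]]], dtype=float)

# Norm of each innermost vector; keepdims=True gives shape (2, 2, 1),
# which broadcasts against X's shape (2, 2, 2).
N = np.linalg.norm(X, axis=-1, keepdims=True)

# Divide only where the norm is nonzero; the other entries keep the zeros from `out`.
R = np.divide(X, N, out=np.zeros_like(X), where=N != 0)
print(R)
```

Since nothing here depends on the number of leading axes, this should also work unchanged when X holds more than two vectors per row.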
seralouk
  • Thanks very much! This really does the job! I just don't completely understand how you calculate the norm in your function. – jerremaier Jun 27 '19 at 13:50
  • I calculate the mean and the standard deviation of each variable individually, then subtract the variable-specific mean and divide each variable by its variable-specific std – seralouk Jun 27 '19 at 13:54
  • Please upvote and accept my answer, since it correctly addresses your question – seralouk Jun 28 '19 at 07:48