How to get all three SVD matrices with sklearn?

Question

Singular value decomposition of a matrix M of size (M,N) means factoring it as M = U Σ V*.

How can I obtain all three matrices using the scikit-learn and numpy packages?

I think I can obtain Sigma with a PCA model:

import numpy as np
from sklearn.decomposition import PCA

model = PCA(N, copy=True, random_state=0)
model.fit(X)

singular_values = model.singular_values_
Sigma = np.diag(singular_values)

What about other matrices?

score 2 · Answer 1 · answered Aug 24 '17 at 16:15

You can get these matrices using numpy.linalg.svd as follows:

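a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
U, S, V = np.linalg.svd(a, full_matrices=True)

S is a 1D array holding the diagonal entries of Sigma, sorted in descending order. U and V are the corresponding matrices from the decomposition. Note that numpy returns the third matrix already transposed (conjugate-transposed for complex input), so a equals U times Sigma times the returned V, with no further transpose.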
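As a quick check, here is a minimal sketch of how the full Sigma matrix can be rebuilt from the 1D array and used to reconstruct the input. The 2 x 3 matrix and the variable names (`Vh`, `Sigma`, `reconstructed`) are only illustrative; a non-square input is used because `np.diag(S)` alone has the right shape only for square matrices.

```python
import numpy as np

# Illustrative non-square input (2 rows, 3 columns).
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)

# The third return value is the (conjugate-)transposed V, named Vh here.
U, S, Vh = np.linalg.svd(a, full_matrices=True)

# With full_matrices=True, U is (2, 2) and Vh is (3, 3),
# so Sigma must be padded with zeros to shape (2, 3).
Sigma = np.zeros(a.shape)
k = min(a.shape)
Sigma[:k, :k] = np.diag(S)

# The product of the three factors should reproduce the input.
reconstructed = U @ Sigma @ Vh
print(np.allclose(reconstructed, a))
```

With `full_matrices=False` the zero padding is unnecessary, since U and Vh are then trimmed to shapes that fit `np.diag(S)` directly.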

By the way, note that PCA centers the data before applying the SVD, whereas numpy.linalg.svd is applied directly to the matrix itself (see lines 409-410 here).
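To illustrate the centering point, the sketch below compares a fitted PCA model with an SVD of the manually centered data. The random data, the choice of `svd_solver="full"` and all variable names are assumptions made for the example. Signs of individual components may differ between the two routes, so the comparison uses absolute values.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data: 20 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))

# Center the columns by hand, then decompose.
Xc = X - X.mean(axis=0)
U, S, Vh = np.linalg.svd(Xc, full_matrices=False)

# Fit PCA keeping every component and using the exact solver.
model = PCA(n_components=4, svd_solver="full").fit(X)

# Sigma and V (transposed) are exposed directly on the fitted model.
same_singular_values = np.allclose(S, model.singular_values_)
same_components = np.allclose(np.abs(Vh), np.abs(model.components_))

# U is not stored, but it can be recovered from the projected data,
# because transform(X) equals U scaled column-wise by the singular values.
U_pca = model.transform(X) / model.singular_values_
rebuilt = (U_pca * model.singular_values_) @ model.components_

print(same_singular_values, same_components, np.allclose(rebuilt, Xc))
```

Under these assumptions all three factors of the centered data can be obtained from the PCA model, but they factor X minus its column means rather than X itself.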

Miriam Farber

Is this equivalent to PCA when the data is noisy? Can't `np.linalg.svd` just throw an exception while the `PCA` estimator will still run? – Dims Aug 24 '17 at 17:16

I don't see why it should be equivalent. The PCA code on GitHub also uses numpy.linalg.svd (this is shown in the link I provided in the answer). The only differences are that the PCA code uses full_matrices=False and centers the data prior to the decomposition, but it is still the same function. – Miriam Farber Aug 24 '17 at 17:26

score 0 · Answer 2 · answered Jun 01 '18 at 13:25

Can't comment on Miriam's answer because I don't have enough reputation, but from looking at Miriam's link, sklearn actually calls scipy's linalg.svd, which doesn't seem to be the same as np.linalg.svd (discussion here).

So it may be better to use U, S, V = scipy.linalg.svd(a, full_matrices=True)
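For comparison, here is a small sketch that runs both functions on the same matrix. The variable names are illustrative, and the check only covers this one input; it does not show that the two implementations behave identically in general. The singular values should agree up to floating-point error, while the columns of U and rows of V may differ in sign, so each result is checked by reconstructing the input instead.

```python
import numpy as np
import scipy.linalg

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Same call signature for both libraries on this input.
U_np, S_np, Vh_np = np.linalg.svd(a, full_matrices=True)
U_sp, S_sp, Vh_sp = scipy.linalg.svd(a, full_matrices=True)

# Singular values are unique, so they can be compared directly.
singular_values_agree = np.allclose(S_np, S_sp)

# For a square matrix, scaling the columns of U by S replaces np.diag(S).
recon_np = (U_np * S_np) @ Vh_np
recon_sp = (U_sp * S_sp) @ Vh_sp

print(singular_values_agree, np.allclose(recon_np, a), np.allclose(recon_sp, a))
```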
