I'm following "Principal component analysis in Python" to use PCA in Python, but I'm struggling to determine which features to choose (i.e. which of my columns/features have the highest variance).
When I use scipy.linalg.svd, it automatically sorts my singular values, so I can't tell which column they belong to.
Example code:
import numpy as np
from scipy.linalg import svd
M = [
[1, 1, 1, 1, 1, 1],
[3, 3, 3, 3, 3, 3],
[2, 2, 2, 2, 2, 2],
[9, 9, 9, 9, 9, 9]
]
M = np.transpose(np.array(M))
U,s,Vt = svd(M, full_matrices=False)
print(s)  # singular values, returned in descending order
Is there a different way to go about this without the Singular Values being sorted?
Update: It looks like this might not be possible, at least according to this post on the Matlab forums: http://www.mathworks.com/matlabcentral/newsreader/view_thread/241607. If anyone knows otherwise, let me know :)
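For reference, here is a minimal sketch of one way to keep track of the columns. It assumes the goal is the variance of each original column in its original order, which can be computed directly with NumPy instead of being read off the singular values. The names `col_var` and `dominant_col` are mine, purely for illustration. Note that the singular values pair with the rows of `Vt` (the components), not with the columns of `M`, and that this example does not center the data the way a full PCA normally would.

```python
import numpy as np
from scipy.linalg import svd

# Same data as above: 6 samples (rows) by 4 features (columns).
M = np.transpose(np.array([
    [1, 1, 1, 1, 1, 1],
    [3, 3, 3, 3, 3, 3],
    [2, 2, 2, 2, 2, 2],
    [9, 9, 9, 9, 9, 9],
], dtype=float))

# Variance of each original column, kept in the original column order.
# Every column in this toy data is constant, so all variances are zero.
col_var = M.var(axis=0)

U, s, Vt = svd(M, full_matrices=False)

# s[i] belongs to component i, whose weights over the original
# columns are given by row i of Vt (illustrative name below).
# For each component, find the column with the largest absolute weight.
dominant_col = np.argmax(np.abs(Vt), axis=1)

print(col_var)
print(s)
print(dominant_col[0])  # column that dominates the first component
```

Because every column in this toy data is constant, only the first singular value is meaningfully non-zero, so only `dominant_col[0]` is informative here; the remaining rows of `Vt` span the null space and are essentially arbitrary.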