8

In sklearn's PCA, is pca.components_ the loadings? I am pretty sure it is, but I am trying to follow along with a research paper and I am getting results that differ from their loadings. I can't find this in the sklearn documentation.

ashish trehan

2 Answers

13

pca.components_ is the orthogonal basis of the space you're projecting the data into. It has shape (n_components, n_features). If you want to keep only the first 3 components (for instance to do a 3D scatter plot) of a dataset with 100 samples and 50 dimensions (also called features), pca.components_ will have shape (3, 50).

I think what you call the "loadings" is the result of the projection for each sample into the vector space spanned by the components. Those can be obtained by calling pca.transform(X_train) after calling pca.fit(X_train). The result will have shape (n_samples, n_components), that is (100, 3) for our previous example.
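To make the shapes above concrete, here is a minimal sketch. The random data and the variable names (`rng`, `projected`, `gram`) are placeholders chosen for illustration and are not part of the original question.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: 100 samples, 50 features (an assumption for illustration).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 50))

pca = PCA(n_components=3)
pca.fit(X_train)
projected = pca.transform(X_train)

print(pca.components_.shape)  # (3, 50): one row per component
print(projected.shape)        # (100, 3): one row per sample

# The rows of components_ are orthonormal, so their Gram matrix is the identity.
gram = pca.components_ @ pca.components_.T
print(np.allclose(gram, np.eye(3)))
```

Each row of `pca.components_` is a unit-length direction in feature space, while each row of `projected` holds the coordinates of one sample along those directions.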

ogrisel
  • According to the documentation, transform applies dimensionality reduction, so my vector has a different shape than the components. I am just trying to replicate a paper, specifically this one (http://ftp.utdallas.edu/~herve/Abdi-rotations-pretty.pdf), and I need the loadings to perform a Varimax rotation so I can build a table that maps the variables to each component. – ashish trehan Apr 05 '16 at 03:43
  • If you don't want to reduce the dimensionality you can just pass `n_components=n_features` to the PCA constructor (this is the default I think) and the result of the call to transform will have shape `(n_samples, n_features)` as well (assuming `n_samples > n_features`). You can also pass `whiten=True` or `whiten=False` to the PCA constructor to decide whether to rescale the "loadings" to have unit variance. – ogrisel Apr 05 '16 at 11:10
  • Please read the source code of the class if you need more details on how transform works, it's not very complicated: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/decomposition/pca.py – ogrisel Apr 05 '16 at 11:11
  • Thank you so much! I was working with it a little too superficially and dug into PCA technique a little more deeply. – ashish trehan Apr 05 '16 at 12:16
0

The previous answer is mostly correct, except about the loadings. components_ is in fact the loadings, as the asker originally stated. The result of fit_transform gives you the principal components (the transformed/reduced matrix).
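One possible source of a mismatch with a paper is the convention used: some texts call the unit-length eigenvectors (the rows of components_) the loadings, while others scale each eigenvector by the square root of its eigenvalue. Which convention a given paper uses is something to check against its definitions; the sketch below only shows how both could be obtained, using random placeholder data and illustrative names (`eigenvectors`, `scaled_loadings`).

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in data: 100 samples, 5 features (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

pca = PCA()                    # keep all components
scores = pca.fit_transform(X)  # projected samples, shape (100, 5)

# Convention 1: unit-length eigenvectors, one column per component.
eigenvectors = pca.components_.T

# Convention 2: each eigenvector scaled by the square root of its eigenvalue.
scaled_loadings = eigenvectors * np.sqrt(pca.explained_variance_)

# With all components kept, the scaled loadings reproduce the sample covariance.
cov = np.cov(X, rowvar=False)
print(np.allclose(scaled_loadings @ scaled_loadings.T, cov))
```

If the paper's numbers differ from components_ by a per-component scale factor, the second convention is worth trying before the rotation step.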

Matt Thomas