Many functions can perform Principal Component Analysis (PCA) on raw data in R. By raw data I mean any data frame or matrix whose rows correspond to observations and whose columns correspond to measurements. Can we carry out PCA on a correlation matrix in R? Which function accepts a correlation matrix as its input?
Have a look at this [question](https://stackoverflow.com/questions/21832254/pca-analysis-using-correlation-matrix-as-input-in-r). `princomp` can take a `covmat` input argument (but it's the covariance matrix not the correlation matrix) instead of the initial dataframe. – Lamia Nov 17 '18 at 16:21
1 Answer
As mentioned in the comments, you can use

    ii <- as.matrix(iris[,1:4])
    princomp(covmat=cor(ii))

This will give you results equivalent to `princomp(ii,cor=TRUE)` (which is not what you want: the latter uses the full data matrix, but returns the values computed when the covariance matrix is converted to a correlation matrix).
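The equivalence claimed above can be checked directly. The following is a minimal sketch, assuming the built-in `iris` data; the names `p_cor`, `p_raw`, `sd_equal` and `load_equal` are introduced here for illustration only. Loadings are compared in absolute value because the sign of each eigenvector is arbitrary.

```r
# Numeric columns of the built-in iris data
ii <- as.matrix(iris[, 1:4])

# PCA from the correlation matrix alone (no raw data supplied)
p_cor <- princomp(covmat = cor(ii))

# PCA from the raw data, converted to a correlation matrix internally
p_raw <- princomp(ii, cor = TRUE)

# Compare component standard deviations
sd_equal <- isTRUE(all.equal(unname(p_cor$sdev), unname(p_raw$sdev)))

# Compare loadings up to the sign of each column
load_equal <- isTRUE(all.equal(abs(unclass(p_cor$loadings)),
                               abs(unclass(p_raw$loadings)),
                               check.attributes = FALSE))

print(c(sd_equal = sd_equal, load_equal = load_equal))
```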
You can also do all the relevant computations by hand if you have the correlation matrix:

    cc <- cor(ii)
    e1 <- eigen(cc)

Standard deviations:

    sqrt(e1$values)
    [1] 1.7083611 0.9560494 0.3830886 0.1439265

Proportion of variance:

    e1$values/sum(e1$values)
    [1] 0.729624454 0.228507618 0.036689219 0.005178709
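You can get the loadings via `e1$vectors`. Compute the scores (according to this CV question) via `scale(ii) %*% e1$vectors`; the data are standardized first because the eigenvectors come from the correlation matrix. This will not give numerically identical answers to `princomp()$scores` (`princomp()` standardizes with divisor n rather than n - 1, and the eigenvector signs are arbitrary), but it gives equivalent results.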
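To make the by-hand route concrete, here is a hedged sketch that repeats the eigen-decomposition and compares the resulting scores with those from `princomp()`. It assumes the raw data are available (scores cannot be computed from the correlation matrix alone); the names `sdev_hand`, `prop_var`, `scores_hand` and `scores_match` are illustrative, not part of any API.

```r
ii <- as.matrix(iris[, 1:4])
n  <- nrow(ii)

# Eigen-decomposition of the correlation matrix
e1 <- eigen(cor(ii))

sdev_hand <- sqrt(e1$values)               # component standard deviations
prop_var  <- e1$values / sum(e1$values)    # proportion of variance

# Scores from standardized data; scale() uses the n - 1 divisor
scores_hand <- scale(ii) %*% e1$vectors

# princomp() standardizes with divisor n, so rescale before comparing;
# absolute values are compared because column signs are arbitrary
p <- princomp(ii, cor = TRUE)
rescaled <- scores_hand * sqrt(n / (n - 1))
scores_match <- isTRUE(all.equal(abs(unname(p$scores)),
                                 abs(unname(rescaled)),
                                 check.attributes = FALSE))
print(scores_match)
```

The rescaling factor `sqrt(n / (n - 1))` only changes the units of the score axes, so plots of either set of scores show the same configuration of points.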

Ben Bolker
If I use the command `res.pca=princomp(covmat=cor(iris[,1:4]))`, I cannot visualize the results with the factoextra package. Namely, I use the command `fviz_pca_ind(res.pca)` and I obtain the answer "The object res.pca doesn't have the element scores. Please use the function princomp() with the argument scores = TRUE". Hence, I include the argument `scores` in the `princomp` function, but I still receive this answer. – Piotr Wilczek Nov 17 '18 at 18:36
You're trying to visualize the distribution of the scores of a PCA without access to the original data ... what exactly do you want the graph to show you if you don't have the original data ... ?? – Ben Bolker Nov 17 '18 at 18:52
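The point of this exchange can be seen in a small sketch, assuming the built-in `iris` data (the names `from_cor` and `from_raw` are illustrative): when only `covmat` is supplied, `princomp()` has no observations to project, so the `scores` element is empty regardless of the `scores` argument.

```r
ii <- as.matrix(iris[, 1:4])

# Only the correlation matrix is supplied: no observations to project
from_cor <- princomp(covmat = cor(ii), scores = TRUE)

# Raw data supplied: scores can be computed
from_raw <- princomp(ii, cor = TRUE)

print(is.null(from_cor$scores))  # whether scores are missing without raw data
print(dim(from_raw$scores))      # one row per observation, one column per component
```

Plots of individuals therefore need the fit made from the raw data, while the standard deviations and loadings are available from either fit.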