The dataset I am working on looks like this: DATA. There are 6 different countries, and r_1..r_13 specify the reasons. I want to apply PCA to this dataset to find the significant reasons for each country. My question is: how can I run PCA for each country without reading a separate file per country? I want to read the entire file once, as shown above. Also, please check the code I am using for PCA:

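    # `numeric` is presumably the data frame of numeric columns r_1..r_13
    pca <- prcomp(numeric, center = TRUE, scale. = TRUE)
    summary(pca)
    eigen_val <- pca$sdev^2        # eigenvalues, i.e. the variance of each PC
    sum(eigen_val)
    prop_var <- round(eigen_val / sum(eigen_val), 4)
    round(sum(prop_var[1:13]), 4)
    load <- pca$rotation           # loadings (rotation matrix)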

After computing the rotation matrix, I will check which PCs are most correlated with which observed variables and decide the significance of the variables accordingly (the more PCs a variable is correlated with, the more significant the variable). Kindly suggest whether this approach is correct. Thanks!
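
The check described above can be sketched as follows. This is only an illustration: the `numeric` data frame from the question is not shown, so the four numeric columns of the built-in `iris` data stand in for it (an assumption), and the 0.10 variance cutoff is arbitrary. Components that explain very little variance can still correlate with a variable, so it may be worth looking only at the components that explain a meaningful share.

```r
# Stand-in for the `numeric` data frame in the question (assumption):
# the four numeric columns of the built-in iris data.
numeric <- iris[, 1:4]

pca <- prcomp(numeric, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component.
eigen_val <- pca$sdev^2
prop_var <- eigen_val / sum(eigen_val)

# Correlation between each original variable and each PC score.
var_pc_cor <- cor(numeric, pca$x)

# With scaled data, these correlations equal the loadings multiplied
# by the standard deviation of the corresponding component.
scaled_loadings <- sweep(pca$rotation, 2, pca$sdev, `*`)
all.equal(unname(var_pc_cor), unname(scaled_loadings))

# Keep only components above an (arbitrary) variance cutoff.
keep <- prop_var > 0.10
round(var_pc_cor[, keep, drop = FALSE], 2)
```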

Kavya
  • Welcome to Stack Overflow. This question is short on details (code and data). Please take a look at these tips on creating a [minimal example](http://stackoverflow.com/help/mcve). That being said, Gregor's post on [working with a list of data.frames](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames) gives the R best practices, using `split` and `lapply`, to answer your question. – lmo Jun 25 '16 at 13:14

1 Answer

Here's a simple starting point that you can tweak to get the results in your desired format. Let's assume you're working with the iris dataset in R and want to run PCA for each Species, much as you want to run PCA for each country in your data.

    data(iris)
    # Split into one data frame per species
    Iris <- split(iris, iris$Species)
    for (i in seq_along(Iris)) {
      # Drop the Species column and store each fit as pca1, pca2, pca3
      assign(paste0("pca", i),
             prcomp(Iris[[i]][names(iris) != "Species"], center = TRUE, scale. = TRUE))
    }
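
As a possible alternative to creating separate variables, the fits can be kept together in one named list with `lapply`. The sketch below again uses iris as a stand-in for the country data; `top_loadings` is a hypothetical helper written for this illustration, not part of any package, and ranking by absolute loading is only one way to pick the "most related" variables.

```r
# Split the numeric columns by group; iris stands in for the country data.
groups <- split(iris[, names(iris) != "Species"], iris$Species)

# One prcomp fit per group, stored in a named list.
pca_by_group <- lapply(groups, prcomp, center = TRUE, scale. = TRUE)

# Hypothetical helper: the n variables with the largest absolute
# loading on a given component.
top_loadings <- function(fit, pc = 1, n = 2) {
  loadings <- fit$rotation[, pc]
  ranked <- loadings[order(abs(loadings), decreasing = TRUE)]
  ranked[seq_len(min(n, length(ranked)))]
}

# Top two variables on PC1 for every group.
lapply(pca_by_group, top_loadings, pc = 1, n = 2)

# Variance explained for a single group.
summary(pca_by_group$setosa)
```

With the country data, `n = 6` would give the six variables with the largest loadings, and sorting with `decreasing = FALSE` would give the smallest.
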
Gaurav Bansal
  • Thanks a lot, Gaurav! Can you please suggest whether my approach is correct? Also, how can I save the results of the for loop instead of printing them on screen, and how can I compute the top 6 and bottom-most related elements from the rotation matrix of each country/Species? – Kavya Jun 26 '16 at 10:02
  • It seems like your approach is fine. I edited my code above to store `pca` instead of printing it. You can use the same `assign` approach to do more computations inside the `for` loop and store them in variables. To see the most important elements of the rotations, you can use the `summary` command. This link might help with the details: http://www.r-bloggers.com/computing-and-visualizing-pca-in-r/ – Gaurav Bansal Jun 27 '16 at 13:29
  • Can someone suggest how I can split files country-wise (variable-wise) if I am working with several files stored in a list? – Kavya Jul 07 '16 at 11:21
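
For the last comment, one possible reading is that each file has already been read into a data frame and the data frames sit in a list. The sketch below assumes exactly that; the column name `country`, the helper `make_df`, and the toy data are all hypothetical stand-ins for the real files.

```r
# Hypothetical stand-in for files that were read into data frames.
set.seed(1)
make_df <- function(n) {
  data.frame(country = rep(c("A", "B"), length.out = n),
             r_1 = rnorm(n),
             r_2 = rnorm(n))
}
files <- list(file1 = make_df(10), file2 = make_df(12))

# Option 1: split each data frame separately (a list of lists).
split_each <- lapply(files, function(d) split(d, d$country))

# Option 2: stack all data frames, then split once by country.
combined <- do.call(rbind, files)
by_country <- split(combined, combined$country)

names(by_country)
```

Option 2 only works if every data frame has the same columns; each element of `by_country` can then be passed to `prcomp` after dropping the `country` column.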