I have run a PCA on price data for 10 Dow Jones stocks, and now I am trying to extract a "stock index" factor from the data by using the first principal component of my PCA, but I don't know how to do this.

library(FactoMineR);
str <- "Exxon Mobil;Intel;McDonald's;Microsoft;Nike;Procter And Gamble;The Travelers Companies;Verizon Communications;Visa;Wal-Mart Stores
84,46;30,81;96,29;40,72;99,55;82,32;107,11;48,92;65,18;80,71
85;31,27;97,44;40,66;100,33;81,94;108,13;48,63;65,41;82,25
85,63;31,46;97,88;40,96;100,89;82,72;109,64;49,12;65,66;82,53
83,58;32;96,96;40,97;99,88;82,31;107,13;48,56;65,54;81,35
84,32;30,08;97,64;41,21;99,33;82,15;106,83;48,42;65,59;81,89
84,86;29,89;98,14;41,46;98,99;83,01;107,61;48,73;65,73;81,32
84,52;30,79;99,36;42,9;100,65;83,92;109,23;49,41;67,1;83,05
85,43;31,2;98,62;42,86;101,46;84,86;109,62;49,64;67,08;83,31
84,54;31,31;97,05;42,88;101,98;84,74;109,73;49,56;67,41;83,24
84,41;30,74;95,98;42,29;98,32;83,38;109,11;49,3;66,81;81,52
86,07;30,89;97;42,5;97,51;83,75;109,52;49,54;267,67001;82,53
84,08;30,59;96,17;41,7;96,54;82,85;108,75;48,95;264,5;82,62
84,76;30,83;97,15;41,56;96,44;83,56;108,93;49,27;269,01999;83,29"

# quote="" keeps the apostrophe in "McDonald's" from being read as an opening quote
Actions <- read.table(text=str, dec=",", header=TRUE, sep=";", quote="")

Actions.PCA<-PCA(Actions)
summary(Actions.PCA)
  • You will need to show us a minimal reproducible example of what you have done; otherwise people will not be able to help you. Please read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – nico May 28 '15 at 13:54
  • In the future I suggest you post data with the `dput(Actions)` command. It is easier to parse. – Backlin May 28 '15 at 14:31
  • Or rather `dput(head(Actions))`. – Backlin May 28 '15 at 14:49

1 Answer

I don't know how to do it with the FactoMineR package, but I do know how to do it with the built-in R function `prcomp`.

Parse data

str <- "Exxon Mobil;Intel;McDonalds;Microsoft;Nike;Procter And Gamble;The Travelers Companies;Verizon Communications;Visa;Wal-Mart Stores
84,46;30,81;96,29;40,72;99,55;82,32;107,11;48,92;65,18;80,71
85;31,27;97,44;40,66;100,33;81,94;108,13;48,63;65,41;82,25
85,63;31,46;97,88;40,96;100,89;82,72;109,64;49,12;65,66;82,53
83,58;32;96,96;40,97;99,88;82,31;107,13;48,56;65,54;81,35
84,32;30,08;97,64;41,21;99,33;82,15;106,83;48,42;65,59;81,89
84,86;29,89;98,14;41,46;98,99;83,01;107,61;48,73;65,73;81,32
84,52;30,79;99,36;42,9;100,65;83,92;109,23;49,41;67,1;83,05
85,43;31,2;98,62;42,86;101,46;84,86;109,62;49,64;67,08;83,31
84,54;31,31;97,05;42,88;101,98;84,74;109,73;49,56;67,41;83,24
84,41;30,74;95,98;42,29;98,32;83,38;109,11;49,3;66,81;81,52
86,07;30,89;97;42,5;97,51;83,75;109,52;49,54;267,67001;82,53
84,08;30,59;96,17;41,7;96,54;82,85;108,75;48,95;264,5;82,62
84,76;30,83;97,15;41,56;96,44;83,56;108,93;49,27;269,01999;83,29"

# text= is required; a bare first argument is treated as a file name
Actions <- read.table(text=str, header=TRUE, dec=",", sep=";")

Run the PCA

pca <- prcomp(Actions)

Get the first component

pca$x[,1]
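
As a minimal sketch of the steps above, the block below runs them on a subset of the question's data (three stocks, six rows) to keep it short. Setting `scale. = TRUE` is my own assumption and not part of the original steps: `prcomp` does not standardise by default, whereas FactoMineR's `PCA` does, so an unscaled result tends to be dominated by whichever stock has the largest price swings. The names `prices`, `index_factor`, `loadings` and `manual` are only illustrative.

```r
# Subset of the question's data: first three stocks, first six rows.
txt <- "Exxon Mobil;Intel;McDonalds
84,46;30,81;96,29
85;31,27;97,44
85,63;31,46;97,88
83,58;32;96,96
84,32;30,08;97,64
84,86;29,89;98,14"

prices <- read.table(text = txt, header = TRUE, dec = ",", sep = ";")

# Centre and standardise each column before extracting components
# (assumption: you want every stock to carry equal weight).
pca <- prcomp(prices, center = TRUE, scale. = TRUE)

index_factor <- pca$x[, 1]         # scores on PC1, one value per row
loadings     <- pca$rotation[, 1]  # weight of each stock in PC1

# The scores are just the standardised data multiplied by the loadings.
manual <- as.vector(scale(prices) %*% loadings)
```

The `loadings` vector shows how much each stock contributes to the factor, which is worth checking before you treat the scores as an index. For a FactoMineR result such as `Actions.PCA`, the individual scores should be in `Actions.PCA$ind$coord[, 1]` and not in `$x`, which would explain a `NULL` from `Actions.PCA$x`.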

Update

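I think the real problem is that your file uses a comma as the decimal separator instead of a dot. Either read the file as text first, instead of parsing it straight into a data frame as `read.csv2` does, convert the commas to dots, and then parse it; or, more simply, tell `read.table` that the decimal separator is a comma with `dec=","`, as in the code that follows. Then run the PCA.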

Actions <- read.table("actions.csv", header=TRUE, dec=",", sep=";")
pca <- prcomp(Actions)
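
If you want to check whether the decimal separator is what breaks the import, a quick sketch like the following may help. It uses a few values from the question; the names `wrong` and `right` are only illustrative. The idea is that without `dec=","` the values are not parsed as numbers, which is the kind of input that makes `prcomp` complain that 'x' must be numeric.

```r
txt <- "Exxon Mobil;Intel
84,46;30,81
85;31,27
85,63;31,46"

# Default dec = "." leaves values such as "84,46" unparsed,
# so the columns come back as character (or factor in older R versions).
wrong <- read.table(text = txt, header = TRUE, sep = ";")

# Declaring the comma as decimal separator gives numeric columns.
right <- read.table(text = txt, header = TRUE, sep = ";", dec = ",")

sapply(wrong, class)  # inspect the column classes of each import
sapply(right, class)
```

Running `sapply(Actions, class)` on your own data frame before calling `prcomp` should show quickly whether any column failed to import as numeric.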
  • When I run `Actions.PCA$x[,1]` or `head(Actions.PCA$x)`, it returns NULL. – Peter May 28 '15 at 14:02
  • I get the error "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric" when I execute `pca <- prcomp(Actions_parse)`. – Peter May 28 '15 at 14:43
  • Strange, it works perfectly for me. Am I correct in assuming you just posted a small part of your dataset (and rightly so!) and that `Actions_parse` is the complete dataset? In that case I guess there was some problem importing the data and that one column isn't numeric. If it is a data frame, can you please run and post `lapply(Actions_parse, class)`? – Backlin May 28 '15 at 14:48
  • Oh! So it works when I put the data directly into R with `str`, but when I run `Actions<-read.csv2("actions.csv")`, then `Actions_res <- read.table(textConnection(gsub(",", ".", Actions)), header=TRUE, sep=";")` and `pca <- prcomp(Actions_res)`, it doesn't work. – Peter May 28 '15 at 14:54
  • Think I got it now. Updated the answer. – Backlin May 28 '15 at 14:59
  • OK! With `str <- readLines("actions.csv")` it works now. It works whether the decimal separator is a comma or a dot. With `pca$x[,1]` I get just the first principal component, right? – Peter May 28 '15 at 15:19
  • Yes. The columns in `pca$x` represent the different components. First column is first component. – Backlin May 28 '15 at 15:36
  • Thanks! Do you know how to draw two graphs on the same plot? – Peter May 28 '15 at 16:16
  • Use `Actions <- read.table(text=str, dec="," , header=TRUE, sep=";")`, which turns out to match the settings in `read.csv2`, which the OP used. – IRTFM May 28 '15 at 20:10
  • @Peter, yes I do know how to make such a plot but that has been answered lots of times before, for example [here](http://stackoverflow.com/questions/11489447/combining-two-plots-in-r), [here](http://stackoverflow.com/questions/1801064/how-to-separate-two-plots-in-r), or [here](http://stackoverflow.com/questions/29439576/how-to-combine-two-plots-in-r), so I'm sure you could find the answer with a little searching. – Backlin Jun 01 '15 at 06:32