I am working on principal components regression. My dataset consists of 150 variables and 60 observations. I am aware that I should have more observations than variables. I ran PCA on my dataset and obtained 9 factors. Some factors include variables with both positive and negative loadings. I then ran a multiple regression of the dependent variable on the factor scores, which also produced both positive and negative regression coefficients. My question is: how do I combine the factor loadings and the regression coefficients when their signs are mixed?

For example: factor 1 has a regression coefficient of -0.17, with factor loadings of 0.4 for var1, -0.3 for var3 and -0.22 for var7. Factor 2 has a regression coefficient of 0.28, with factor loadings of -0.21 for var2, 0.4 for var3 and -0.3 for var6.
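
To make the example concrete, here is a minimal sketch in R of one way the numbers could be combined: multiply each loading by the regression coefficient of its factor and sum over factors. It assumes that the scores were computed as standardized data times loadings, and that any variable not listed for a factor has a loading of 0 on it. Both are assumptions of this sketch, not facts from my analysis.

```r
# Loadings from the example; a variable not listed for a factor is
# ASSUMED to have loading 0 on that factor (illustration only).
loadings <- rbind(
  var1 = c(F1 =  0.40, F2 =  0.00),
  var2 = c(F1 =  0.00, F2 = -0.21),
  var3 = c(F1 = -0.30, F2 =  0.40),
  var6 = c(F1 =  0.00, F2 = -0.30),
  var7 = c(F1 = -0.22, F2 =  0.00)
)

# Regression coefficients of y on the two factor scores (from the example).
coefs <- c(F1 = -0.17, F2 = 0.28)

# Implied coefficient per standardized variable:
# sum over factors of (loading * factor coefficient).
implied <- drop(loadings %*% coefs)
print(round(implied, 4))
```

Under these assumptions, var3 would get (-0.3)(-0.17) + (0.4)(0.28) = 0.163, so a negative loading on a factor with a negative coefficient contributes positively to y.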

My goal is to group my 150 variables, name these groups, and explain which groups cause a higher or lower value of y. I also want to know whether the variables in those groups then increase or decrease. So far I have standardized my x variables, used parallel analysis to decide how many factors to retain, and applied PCA with the following code:

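library(chemometrics)        # assumption: nipals() comes from the chemometrics package
pca <- nipals(xVars, a = 9)  # run NIPALS once and reuse the result
scores <- pca$T              # factor scores
loadings <- pca$P            # factor loadings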

With the factor scores I then fit a regression, where x1 to x9 are the scores of my nine factors: fit <- lm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9). The summary of the model gives the coefficients. How can I combine these coefficients with the corresponding factor loadings?
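
The workflow described above can be sketched end to end as follows. This is only a sketch under stated assumptions: the data are simulated (not my real data), `prcomp()` from base R stands in for `nipals()` so that the block runs without extra packages, and the names `Xs`, `gamma` and `beta` are hypothetical. It shows that when the scores equal the standardized data times the loadings, the loadings times the score coefficients give one implied coefficient per original variable.

```r
set.seed(1)
n <- 60; p <- 150; k <- 9           # same dimensions as in the question

# SIMULATED data, for illustration only.
X <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("var", seq_len(p))))
y <- rnorm(n)

Xs <- scale(X)                      # standardize the x variables

# PCA via prcomp() (swapped in for nipals()); data are already scaled.
pca <- prcomp(Xs, center = FALSE, scale. = FALSE)
scores   <- pca$x[, 1:k]            # n x k factor scores
loadings <- pca$rotation[, 1:k]     # p x k loadings (unit-length columns)

# Regress y on the k factor scores.
fit   <- lm(y ~ scores)
gamma <- coef(fit)[-1]              # coefficients of the scores (no intercept)

# Back-transform: implied coefficient for each standardized variable.
beta <- drop(loadings %*% gamma)

# Check: predictions from beta match the fitted values of the score model.
fitted_direct <- drop(coef(fit)[1] + Xs %*% beta)
stopifnot(isTRUE(all.equal(unname(fitted(fit)), unname(fitted_direct))))

# Variables with the largest positive implied coefficients.
print(head(sort(beta, decreasing = TRUE)))
```

The sign of each entry of `beta` then indicates whether an increase in that standardized variable goes with a higher or lower predicted y, which may be easier to read than the separate signs of loadings and coefficients.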

Data:

y   x1  x2  x3  x4  x5
-1,392  0,033   4,471   0,038   0,148   2,208
2,740   0,066   52,836  0,041   0,526   0,186
-0,066  0,219   10,559  0,132   0,488   0,230

Factor loadings:

    F1  F2  F3  F4
1   0,10    0,07    0,16    0,08
2   0,05    -0,03   -0,01   -0,22
3   0,14    0,06    0,05    0,01
4   0,12    -0,08   -0,01   -0,03
5   0,12    -0,12   -0,03   0,07
