I'm quite new to machine learning and was just introduced to principal component analysis as a dimensionality reduction method. What I don't understand is: in which circumstances is PCA any better than simply removing some features from the model? If the aim is to obtain lower-dimensional data, why don't we just group the features that are correlated and retain a single feature from each group?
This is a good question, but it's better suited to [CrossValidated](http://stats.stackexchange.com), which is the stats/ML sibling to StackOverflow. – Matt Parker Nov 19 '15 at 23:24
1 Answer
There is a fundamental difference between feature reduction (such as PCA) and feature selection (which you describe). The crucial difference is that feature reduction (PCA) maps your data to a lower-dimensional space through a projection built from all of the original dimensions; PCA, for example, expresses each new dimension as a linear combination of every original feature. So the final embedding carries information from all features. If you perform feature selection, you discard information: whatever was present only in the dropped features is completely lost. Furthermore, PCA lets you retain a known fraction of the data's variance, because the components are ordered by the variance they explain.
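To make the contrast concrete, here is a minimal sketch (not part of the original answer) using scikit-learn's `PCA` on toy data with two correlated features; the exact numbers are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Toy data: 200 samples, 5 features, where feature 1 is almost a copy of feature 0.
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)

# Feature selection: drop the "redundant" column entirely.
# Any signal that lives only in column 1 is gone for good.
X_selected = np.delete(X, 1, axis=1)

# Feature reduction: PCA projects onto linear combinations of *all* columns.
pca = PCA(n_components=4)
X_reduced = pca.fit_transform(X)

# Each retained component mixes every original feature,
# and the explained variance ratio tells you how much variance you kept.
print(pca.components_)                       # shape (4, 5): each row uses all 5 features
print(pca.explained_variance_ratio_.sum())   # fraction of total variance retained
```

Inspecting `pca.components_` shows that no original feature is thrown away outright; each component weights all of them, which is the point the answer makes.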

lejlot
As far as I understood, with PCA we eliminate dimensions that are correlated, i.e. linearly dependent. That said, projecting all those dimensions doesn't seem to retain any more information than just dropping them... Am I missing something? – Botond Nov 19 '15 at 23:37
That has nothing to do with PCA. PCA looks for a linear projection which preserves most of the variance. It does not "eliminate" any dimensions. – lejlot Nov 19 '15 at 23:48