The difference between supervised and unsupervised learning when using PCA

Question

I have read the answer here, but I can't apply it to one of my examples, so I probably still don't get it.

Here is my example: suppose my program is trying to learn PCA (principal component analysis), or the diagonalization process. I have a matrix, and the answer is its diagonalization:

A = PDP^-1

If I understand correctly:

In supervised learning I will have all the trials along with their errors.

My question is:

What will I have in unsupervised learning?

Will I get an error for each trial as I go along, rather than having all the errors in advance? Or is it something else?

Possible duplicate of [What is the difference between supervised learning and unsupervised learning?](http://stackoverflow.com/questions/1832076/what-is-the-difference-between-supervised-learning-and-unsupervised-learning) — juanpa.arrivillaga, Feb 09 '17 at 16:50

score 3 · Answer 1 · Felix Glas · 2017-02-09T17:05:19.790

First of all, PCA is used for neither classification nor clustering. It is an analysis tool that finds the principal components of the data, which can be used for, e.g., dimensionality reduction. Supervised and unsupervised learning have little relevance here: PCA works on the data alone, with no labels or errors involved.

However, PCA can often be applied to data before a learning algorithm is used.
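To make the "no labels needed" point concrete, here is a minimal sketch in plain Python. It is not a full diagonalization: it assumes power iteration on the covariance matrix and recovers only the first principal component. The function name `first_principal_component` and the toy data are made up for this illustration.

```python
import math

def first_principal_component(rows, iterations=100):
    """Return (unit direction, variance along it) for the dominant component.

    Illustrative helper: uses power iteration on the sample covariance
    matrix instead of a full eigendecomposition. Note that no labels
    are passed in anywhere.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient: variance of the data along direction v.
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    variance = sum(v[i] * cov_v[i] for i in range(d))
    return v, variance

# Toy unlabeled data lying on the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, variance = first_principal_component(data)
print(component, variance)
```

For this toy data the direction found is parallel to (1, 2), and projecting each point onto it reduces the two columns to one without losing any variance.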

In supervised learning, you have (as you say) a labeled set of data with "errors".

In unsupervised learning you don't have any labels, i.e., you can't validate results against known answers. All you can do is cluster the data somehow. The goal is often to obtain clusters that are internally more homogeneous. Success can be measured, e.g., using the within-cluster variance.
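As a rough sketch of how such a measure works, the snippet below computes the within-cluster sum of squares, one common form of the within-cluster variance. The function name `within_cluster_sum_of_squares`, the toy points, and both cluster assignments are assumptions made up for this example.

```python
def within_cluster_sum_of_squares(points, assignments):
    """Sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, cluster_id in zip(points, assignments):
        clusters.setdefault(cluster_id, []).append(point)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[j] for p in members) / len(members)
                    for j in range(dim)]
        total += sum((p[j] - centroid[j]) ** 2
                     for p in members for j in range(dim))
    return total

# Four unlabeled points forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

# Two candidate clusterings of the same data (hypothetical assignments).
tight = within_cluster_sum_of_squares(points, [0, 0, 1, 1])
mixed = within_cluster_sum_of_squares(points, [0, 1, 0, 1])
print(tight, mixed)
```

The clustering that keeps nearby points together gets the much lower score, and no "correct answer" was needed to tell the two apart.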

edited Feb 09 '17 at 17:05

answered Feb 09 '17 at 16:53


But how can clustering help? Maybe neither of the cluster sets is the PCA result... I don't see the point in just clustering. – user135172 Feb 10 '17 at 08:02

@user135172 Forget about PCA when talking about cluster analysis, it's apples and oranges. **Example of clustering:** let's say you're doing a music recommender system and you want to recommend music liked by users with similar taste. Then you could cluster all users with regard to their music preferences (_e.g._, using DBSCAN). Then when a single user is to be recommended music, you present all music that other people within the same cluster as the selected user like. This is unsupervised learning. – Felix Glas Feb 10 '17 at 16:39
I see. So is it like an indirect system? I mean, in this music recommender example, what would supervised learning look like? Would it be a system that directly checks music types instead of indirectly checking people in the same cluster? Is that the difference between supervised and unsupervised learning? – user135172 Feb 12 '17 at 12:22
@user135172 If using supervised learning, we can turn this into a **classification problem**. We now want to recommend music that is similar to the music the user already likes. If we consider the user's liked music as positives, and all other music as negatives, then we have a labeled set that can be used for training. Now a classifier algorithm is trained using our training set, and will henceforth be able to classify _any_ music as positive or negative (liked or not liked). This is supervised learning. – Felix Glas Feb 12 '17 at 14:31

score 1 · Answer 2 · edited Sep 30 '17 at 10:40

Supervised Learning:

-> You give labeled example data as input, along with the correct answer for each example.

-> The algorithm learns from it and starts predicting the correct result for new input.

Example: an email spam filter

Unsupervised Learning:

-> You give just the data and don't tell the algorithm anything like a label or the correct answer.

-> The algorithm automatically analyses patterns in the data.

Example: Google News
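The contrast can be sketched in a few lines of plain Python. This is only an illustration under made-up assumptions: each message is reduced to a single hypothetical number (say, a count of suspicious words), the supervised part is a nearest-centroid classifier, and the unsupervised part is k-means with k = 2 on the same numbers without labels.

```python
def train_nearest_centroid(samples, labels):
    """Supervised: labels are given, so compute one centroid per label."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))

def two_means(samples, iterations=10):
    """Unsupervised: no labels; split 1-D data into two groups."""
    low, high = min(samples), max(samples)  # initial centroids
    for _ in range(iterations):
        group_low = [x for x in samples if abs(x - low) <= abs(x - high)]
        group_high = [x for x in samples if abs(x - low) > abs(x - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return group_low, group_high

# Hypothetical feature values, one number per message.
samples = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

model = train_nearest_centroid(samples, labels)  # uses the labels
print(predict(model, 7.0))

groups = two_means(samples)  # never sees the labels
print(groups)
```

The supervised model can name its prediction because it was trained on the answers. The unsupervised split finds the same two groups but has no idea which one is "spam".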
